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JOAN  ELIZABETH  CARLETTA 


Methodologies  for  built-in  self- test  (BIST)  insertion  in  VLSI  circuits  are  presented  for 
three  different  levels  of  design  abstraction.  The  methodologies  are  designed  to  be  used 
during  the  design  flow  of  application  specific  integrated  circuits  (ASICs),  which  starts 
at  the  algorithmic  level  in  the  behavioral  domain  and  moves  to  the  register  transfer 
level  in  the  structural  domain,  and  finally  to  the  gate  level  in  the  structural  dommn.  At 
each  level,  the  methodology  is  based  on  the  use  of  testability  metrics  to  identify  and 
remove  points  of  low  testability  in  a  circuit.  By  quantifying  the  properties  that  make  a 
BIST  scheme  successful,  the  testability  metrics  provide  a  way  to  measure  fest  quality 
implicitly,  without  resorting  to  fault  simulation,  which  is  both  expensive  and  not  avail¬ 
able  at  the  higher  levels  of  design  abstraction.  The  testability  metrics  are  computed 
using  a  Markov  chain  model. 


11 


Fault  coverage  curves  show  that  when  the  BIST  insertion  methodologies  are  applied, 
the  resulting  circuits  are  significantly  easier  to  test  than  circuits  designed  without 
regard  to  testability.  Wherever  fault  coverage  results  are  given,  layout  areas,  transistor 
counts,  and  critical  delays  are  also  given  so  that  the  trade-off  between  a  circuit’s  test¬ 
ability  and  its  area  and  performance  can  be  fully  appreciated. 

Examples  of  our  insertion  methodologies  employ  three  different  BIST  schemes:  con¬ 
ventional  BIST,  circular  BIST,  and  the  circular  self-test  path  technique.  For  circular 
BIST  and  the  circular  self-test  path  technique,  special  care  must  be  taken  when  adding 
the  test  circuitry  to  a  design.  This  work  explores  the  problems  that  can  occur,  and  out¬ 
lines  structural  constraints  that  should  be  followed  to  avoid  the  problems. 

The  three  BIST  insertion  methodologies  introduced  complement  one  another,  rather 
than  compete  with  one  another.  Although  it  is  desirable  to  consider  testability  as  early 
as  possible  in  the  design  flow,  the  algorithmic  level  methodology  does  not  supercede 
the  register  transfer  and  gate  level  methodologies.  We  show  that  each  methodology 
has  its  own  area  of  application. 
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Chapter  1 


Introduction 


As  very  large  scale  integration  (VLSI)  technology  has  progressed,  and  it  has  become 
possible  to  put  more  and  more  transistors  on  a  single  integrated  circuit,  testing  a  circuit 
to  see  whether  it  is  functioning  properly  has  become  an  increasingly  difficult  task.  A 
variety  of  techniques  for  alleviating  the  testing  problem  have  been  developed.  This 
dissertation  focuses  on  one  increasingly  popular  technique,  built-in  self-test  (BIST). 
The  main  idea  behind  BIST  is  to  put  everything  that  is  needed  to  test  the  circuit 
directly  on  the  chip,  so  that  the  circuit  is  capable  of  testing  itself.  The  main  thrust  of 
the  dissertation  is  the  development  of  methodologies  for  automatic  insertion  of  BIST 
into  VLSI  application  specific  integrated  circuits  (ASICs);  given  a  circuit,  we  would 
like  to  automatically  modily  the  circuit,  adding  the  circuitry  that  is  necessary  to  per¬ 
form  a  built-in  self- test.  As  we  do  this,  we  must  keep  in  mind  that  the  addition  of  cir¬ 
cuitry  to  the  chip,  while  making  the  chip  easier  to  test,  can  also  have  the  ill  effects  of 
making  the  chip  larger  in  area  (and  therefore  higher  in  cost)  and  slower  in  speed. 

Before  we  can  devise  methodologies  for  BIST  insertion,  we  must  have  a  clear  under¬ 
standing  of  how  ASICs  are  designed.  Contemporary  ASIC  design  is  a  process  consist- 
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ing  of  many  steps,  during  which  a  design  flows  from  one  level  of  design  abstraction  to 
another.  Typically,  a  design  begins  as  an  “algorithmic”  description,  almost  like  a  com¬ 
puter  program,  that  specifies  the  function  that  the  circuit  should  perfomi.  At  this  level, 
the  design  is  a  single  “black  box”;  we  know  what  the  circuit  should  do,  but  not  how  it 
should  do  it.  As  the  design  moves  from  level  to  level,  synthesis  tools  are  used  to  fill  in 
more  and  more  of  the  details  of  how  the  circuit  should  be  implemented.  Early  on,  we 
may  view  the  circuit  as  an  interconnection  of  complex  hardware  units  like  adders  and 
multipliers  (each  still  a  “black  box”);  later,  the  circuit  will  become  a  netlist  of  gates, 
and  ultimately  it  will  be  expressed  in  terms  of  layout. 

Built-in  self- test  insertion  can  be  done  at  any  point  during  the  ASIC  design  flow.  Gen¬ 
erally  speaking,  it  is  best  to  begin  thinking  about  testing  early  on  in  the  design  flow, 
since  how  a  circuit  is  designed  has  a  profound  impact  on  how  easy  it  is  to  test.  Once 
low  level  decisions  about  a  circuit  implementation  have  been  made,  it  may  be  too  late 
to  easily  incorporate  testability;  making  modifications  for  the  sake  of  testability  may 
require  major  changes  in  the  design.  However,  as  we  shall  see  later,  there  are  times 
when  we  can  not  begin  testability  consideration  at  the  highest  levels  of  design  abstrac¬ 
tion.  Therefore,  there  is  need  for  BIST  insertion  methodologies  that  operate  over  the 
whole  range  of  design  abstraction  levels. 
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1.1  Problem  Definition 

The  goal  of  this  work  is  the  development  of  methodologies  for  automatic  BIST  inser¬ 
tion,  in  coordination  with  design  synthesis  tools.  In  this  work,  we  describe  three  sepa¬ 
rate  methodologies  for  BIST  insertion,  operating  at  three  different  levels  of  design 
abstraction,  and  therefore  at  three  different  points  in  the  ASIC  design  flow.  The  first 
works  with  logic  level  design  descriptions.  The  second,  at  a  higher  level,  works  with 
register  transfer  level  descriptions.  The  third  works  at  the  highest  level  of  the  ASIC 
design  flow;  it  operates  on  algorithmic  level  design  descriptions.  Despite  differences 
in  design  level,  the  methodologies  have  many  similarities.  All  take  as  input  a  design 
description,  and  add  BIST  features  to  that  design  description  so  that  when  the  descrip¬ 
tion  is  synthesized  to  create  a  physical  implementation,  that  physical  circuit  can  be 
easily  tested.  Furthermore,  all  three  methodologies  keep  in  mind  four  goals: 

•  to  minimize  the  effect  of  the  BIST  insertion  on  the  overall  area. 

•  to  minimize  the  effect  of  the  BIST  insertion  on  the  system  performance. 

•  to  maximize  the  fault  coverage  that  can  be  obtained  during  test. 

•  to  minimize  the  test  time  required  to  obtain  that  level  of  fault  coverage. 

All  use  a  unified  set  of  testability  metrics  to  quantify  the  testability  of  the  original  cir¬ 
cuit  and  identify  trouble  spots  in  the  design.  These  metrics  provide  a  mechanism  for 
trading  off  area  overhead  and  system  performance  against  obtainable  fault  coverage 
and  required  test  time. 

We  now  briefly  describe  the  approach  for  each  of  the  three  methodologies. 
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1.1.1  Logic  level  BIST  insertion 

The  first  methodology  operates  on  logic  level  sequential  circuits.  It  starts  by  assuming 
a  minimal  BIST  paradigm,  in  which  the  primary  inputs  of  a  circuit  are  driven  by  a  test 
pattern  generation  register  (TPGR)  that  provides  stimuli  for  the  circuit,  and  the  pri¬ 
mary  outputs  are  fed  into  a  multiple  input  shift  register  (MISR)  that  analyzes  the 
responses  of  the  circuit  to  the  stimuli.  Thus,  the  primary  inputs  are  assumed  controlla¬ 
ble,  and  the  primary  outputs  are  assumed  observable.  It  then  computes  the  testability 
metrics  for  the  internal  signals  of  the  circuit;  signals  with  low  testability  metrics  desig¬ 
nate  possible  trouble  spots  when  testing  the  physical  circuit.  A  two  step  approach  first 
improves  the  testability  metrics  for  the  flip-flops,  so  that  each  combinational  logic 
block  receives  high  quality  test  patterns,  and  then  improves  the  testability  metrics  of 
internal  signals.  Two  different  kinds  of  insertion,  multiplexer-based  test  point  insertion 
and  circular  BIST  insertion,  are  used.  The  output  of  the  methodology  is  a  logic  level 
circuit  with  added  circuitry  that  is  much  more  easily  tested  than  the  original. 

1.1.2  Register  transfer  level  BIST  insertion 

Our  second  methodology  operates  on  design  descriptions  at  flie  register  transfer  level. 
At  this  level,  circuits  are  described  as  interconnections  of  arithmetic  logic  units 
(ALUs)  and  registers.  Like  our  logic  level  BIST  insertion  methodology,  our  register 
transfer  level  BIST  insertion  methodology  begins  with  a  circuit  with  a  minimal 
amount  of  BIST;  the  only  BIST  added  to  the  circuit  is  that  necessary  to  drive  the  pri¬ 
mary  input  registers  of  the  circuit  with  test  patterns,  and  to  compact  the  responses  from 
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the  primary  output  registers.  The  methodology  then  computes  testability  metrics  for 
the  signals  of  the  circuit.  Here,  the  signals  are  the  width  of  the  words  in  the  datapath. 
The  metrics  values  are  used  to  guide  the  BIST  insertion  by  finding  potential  testability 
problems  in  the  datapath;  thus,  internal  registers  of  the  datapath  are  replaced  with 
BIST  registers  only  when  necessary.  The  output  of  the  methodology  is  a  modified  ver¬ 
sion  of  the  register  transfer  level  circuit,  with  added  circuitry  to  make  the  design  easier 
to  test. 

1.1.3  Algorithmic  level  BIST  insertion 

The  goal  of  our  work  at  the  algorithmic  level^  is  to  develop  a  methodology  for  BIST 
insertion  that  achieves  high  fault  coverage  for  both  the  datapath  and  the  controller  cre¬ 
ated  during  the  ASIC  design  flow.  The  methodology  is  essentially  a  pre-synthesis  pro¬ 
cess  that  is  meant  to  be  independent  of  the  high  level  synthesis  tool  used.  Thus,  our 
algorithmic  level  BIST  insertion  methodology  takes  an  algorithmic  level  design 
description  as  input,  and  modifies  it  to  create  another  algorithmic  level  description 
such  that  when  the  modified  version  is  synthesized  using  the  normal  ASIC  design 
flow,  the  result  is  a  physical  circuit  that  is  easily  tested  using  a  simple  BIST  scheme. 


1.  Some  researchers  prefer  the  term  “behavioral  level”  to  the  term  “algorithmic  level”  used  here. 
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L2  Organization  of  Dissertation 

The  motivation  for  this  work  lies  in  BIST  insertion,  which  is  the  process  of  adding 
BIST  features  to  a  circuit.  We  consider  a  number  of  different  BIST  methodologies, 
including  conventional  BIST,  circular  BIST,  and  the  circular  self-test  path  technique. 
Chapter  Two  provides  background  by  defining  some  fundamental  aspects  of  design- 
for-testability.  Chapter  Three  is  a  survey  of  recent  related  research  in  design-for-test- 
ability,  especially  testability  insertion;  we  cover  insertion  for  both  partial  scan  and 
BIST  because  although  the  two  are  different  applications,  much  of  the  underlying 
motivation  is  the  same.  Chapter  Four  presents  mathematical  definitions  for  the  test¬ 
ability  metrics  that  form  the  cornerstone  of  all  our  BIST  insertion  procedures;  these 
metrics  are  used  to  pinpoint  areas  of  a  design  that  have  testability  problems.  Chapter 
Five  describes  a  Markov  model  that  is  used  to  compute  the  testability  metrics. 

Regardless  of  the  level  of  design  abstraction  at  which  a  BIST  insertion  procedure 
operates,  the  first  step  in  insertion  is  that  of  test  point  selection,  or  deciding  where  to 
place  test  elements  (whether  registers  or  flip-flops)  within  a  circuit.  For  the  circular 
self-test  path  technique,  insertion  requires  a  second  step,  that  of  circular  self-test  path 
formation,  in  which  the  selected  registers  are  chained  together  in  a  particular  order, 
each  with  a  particular  orientation  or  shift  direction.  This  second  step  results  in  some 
difficulties  unique  to  the  circular  self-test  path  technique.  Chapter  Six  describes  these 
difficulties,  and  shows  how  they  can  be  avoided  by  placing  constraints  on  the  structure 
of  the  circular  self-test  path. 
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The  next  three  chapters  provide  specifics  of  our  BIST  insertion  procedures  at  three  dif¬ 
ferent  levels  of  design  abstraction.  Chapter  Seven  demonstrates  our  logic  level  inser¬ 
tion  methodology,  using  a  submodule  of  an  industrial  design  as  both  motivation  and 
example.  Chapter  Eight  discusses  a  similar  BIST  insertion  methodology  at  the  register 
transfer  level,  using  examples  to  demonstrate  how  the  metrics  facilitate  test  point 
selection.  Chapter  Nine  moves  BIST  insertion  into  the  behavioral  domain  by  consider¬ 
ing  testability  at  the  algorithmic  level.  Finally,  Chapter  Ten  presents  conclusions,  and 


points  towards  future  research. 


Chapter  2 


Fundamentals 


This  chapter  provides  background  material  preliminary  to  the  main  work  of  this  dis¬ 
sertation,  and  may  be  safely  skipped  by  readers  already  acquainted  with  the  design  of 
application  specific  integrated  circuits  (ASICs)  and  design-for-testability.  The  chapter 
begins  in  Section  2. 1  with  a  formal  description  of  the  levels  of  abstraction  at  which  a 
design  can  be  described.  A  typical  ASIC  design  flow,  from  concept  to  silicon,  is  pre¬ 
sented  in  Section  2.2.  Section  2.3  describes  the  importance  of  design-for-testability 
(DFT),  which  is  the  consideration  of  testability  in  the  course  of  the  design  flow.  There 
are  the  two  basic  approaches  to  test,  and  deterministic  and  pseudorandom-.  Section  2.4 
compares  them,  outlining  the  advantages  and  disadvantages  of  each.  Design-for-test¬ 
ability  methodologies  specific  to  each  approach  are  introduced;  these  include  both 
scan  methodologies  for  deterministic  test  in  Section  2.5  and  built-in  self-test  method¬ 
ologies  for  pseudorandom  test  in  Section  2.6.  Section  2.7  describes  how  design-for- 
testability  can  be  integrated  with  the  design  flow.  Section  2.8  introduces  two  key  com¬ 
ponents  of  testability,  controllability  and  observability.  Section  2.9  is  a  summary. 
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2.1  Levels  of  Design 

The  levels  of  abstraction  at  which  a  design  can  be  specified  are  illustrated  in  Figure 
2-1,  a  Y-chart  introduced  in  [GaKu83].  The  three  axes  represent  the  three  domains  in 


System  Level 


which  a  design  can  be  specified.  The  behavioral  domain  views  a  design  in  terms  of  its 
function  or  input-output  relationship,  without  any  notion  of  physical  implementation. 
The  structural  domain  views  a  design  as  a  netlist  of  interconnected  components, 
where  each  component  is  a  “black  box”  element  described  behaviorally.  The  physical 
domain  views  a  design  in  terms  of  its  physical  or  geometric  properties.  In  each 
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domain,  a  range  of  levels  of  abstraction  are  possible,  with  die  more  abstract  levels  at 
the  periphery  of  the  chart.  The  points  on  the  axes  are  labelled  with  a  typical  view  of  the 
design  at  the  given  level  and  in  the  given  domain.  For  example,  from  the  Y-chart  we 
can  see  that  a  register  transfer  level  (RTL)  description  of  a  design  in  the  structural 
domain  consists  of  an  interconnection  of  arithmetic  logic  units  (ALUs),  registers,  and 
multiplexers.  If  we  described  this  same  design  at  the  circuit  level  in  the  physical 
domain,  it  would  consist  of  a  number  of  rectangles  and  polygons  in  polysilicon,  metal, 
and  other  materials  used  in  integrated  circuits. 


2.2  ASIC  Design  Flow 

An  ASIC  design  flow  starts  with  a  design  description  at  the  algorithmic  level  in  the 
behavioral  domain,  and  traverses  the  Y-chart  to  create  a  complete  physical  implemen¬ 
tation  for  the  design.  A  typical  path  taken  is  shown  in  Figure  2-2.  The  tools  used  to 
perform  the  traversal  are  as  follows: 

A.  High  Level  Synthesis  takes  as  input  a  design  described  at  the  algorithmic  level  in 
the  behavioral  domain.  This  input  is  an  algorithm  describing  the  function  that  the 
final  circuit  is  intended  to  perform.  The  output  of  high  level  synthesis  is  a  regis¬ 
ter  transfer  level  description  in  the  structural  domain.  This  description  consists  of 
two  parts.  The  first  part  is  a  register  transfer  level  datapath  capable  of  performing 
the  intended  function.  This  datapath  is  described  as  an  interconnection  of  arith¬ 
metic  logic  units,  registers,  and  multiplexers;  each  of  these  interconnected  com- 
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ponents  is  a  “black  box”  described  functionally.  The  second  part  is  a  control 
flow,  usually  in  the  form  of  a  state  diagram,  that  describes  the  way  in  which  the 
datapath  must  be  controlled  in  order  for  it  to  perform  the  intended  function. 

B.  Logic  Level  Synthesis  takes  the  register  transfer  level  datapath  and  transforms  it 
into  a  logic  level  implementation  in  the  structural  domain.  It  does  this  by  filling 
in  the  “black  box”  functional  description  of  each  component  in  the  datapath  with 
a  logic  level  implementation.  At  this  point,  the  control  flow  may  also  be  synthe¬ 
sized  to  the  logic  level  using  a  finite  state  machine  implementation. 

C.  Technology  Mapping  takes  as  input  a  gate  level  design  description  in  the  struc¬ 
tural  domain.  It  moves  the  design  into  the  physical  domain  by  mapping  the  gates 
to  standard  cells  from  a  given  library.  Layout  is  available  for  each  standard  cell. 
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D.  Placement  and  Routing  take  as  input  the  standard  cell  implementation  of  the 
design  in  the  physical  dommn.  The  standard  cells  are  physically  situated,  and  the 
interconnections  between  the  cells  are  routed.  The  result  is  a  layout  of  the  overall 
design  that  is  ready  for  fabrication. 


2,3  Importance  of  Design-for-Testability 

Traditionally,  design  and  test  have  been  considered  two  separate  processes;  one  group 
of  engineers  completed  the  design  of  the  circuit,  and  a  separate  group  was  responsible 
for  building  and  testing  the  circuit.  However,  as  advances  in  VLSI  technology  have 
made  higher  and  higher  package  densities  possible,  this  approach  has  become  less  and 
less  feasible.  Testability  problems  are  caused  primarily  by  an  inability  either  to  control 
a  signal  embedded  within  a  circuit  to  some  required  value,  or  to  observe  the  value  of 
an  embedded  signal.  As  package  density  increases,  the  number  of  gates  per  pin 
increases.  Since  the  internal  parts  of  the  chip  can  be  controlled  and  observed  only  indi¬ 
rectly  through  the  pins,  this  makes  testing  the  chip  more  difficult.  As  a  result,  testabil¬ 
ity  decreases  dramatically,  and  the  effort  spent  in  testing  a  chip  becomes  a  significant 
portion  of  the  overall  cost  of  getting  a  new  design  to  market. 

Design-for-testability  (DPT),  or  the  consideration  of  testability  during  the  design  flow, 
was  introduced  in  the  1970s  as  a  remedy  to  decreasing  testability.  At  the  beginning, 
design-for-testability  involved  designing  a  circuit  so  that  it  had  certain  general  proper¬ 
ties  that  made  it  easier  to  test;  two  such  properties  are  that  the  design  contain  no  asyn- 
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chronous  logic  and  that  it  be  easy  to  initialize  to  a  known  state  [BaMS87].  Later,  more 
structured  DFT  techniques  were  developed.  Because  the  best  choice  of  DFT  technique 
depends  on  the  test  approach  used,  we  now  digress  to  introduce  the  two  main 
approaches  to  test  before  coming  back  to  the  specifics  of  design-for-testability.  When 
we  return  to  design-for-testability,  we  will  describe  specific  DFT  techniques  appropri¬ 
ate  to  each  of  the  approaches. 


2.4  Approaches  to  Test 

The  basic  principle  of  test  is  to  apply  stimuli,  or  test  patterns,  to  the  circuit  under  test, 
and  to  analyze  the  responses  of  the  circuit  to  the  test  patterns  to  determine  whether  the 
circuit  is  responding  as  expected.  Test  methods  can  be  divided  into  two  basic 
approaches,  depending  on  the  type  of  test  pattern  used.  Deterministic  test  applies  a  set 
of  test  patterns  that  are  tailor-made  for  the  circuit  in  question,  while  pseudorandom 
test  applies  pseudorandomly  generated  test  patterns.  Each  approach  has  advantages 
and  disadvantages.  Determining  which  test  patterns  to  include  in  a  deterministic  test  is 
generally  computationally  intensive,  and  requires  an  in-depth  analysis  of  the  circuit. 
This  computation  of  test  patterns  is  usually  a  one-time  cost;  however,  if  even  small 
changes  are  made  to  the  circuit  design,  the  deterministic  test  may  no  longer  be  valid, 
and  so  re-computation  of  the  test  patterns  may  be  necessary.  In  contrast,  computing  the 
test  patterns  for  a  pseudorandom  test  is  simple,  and  the  test  patterns  need  not  be 
changed  when  the  design  is  changed.  Since  the  test  patterns  in  a  deterministic  test  are 
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tailor-made  for  the  circuit  under  test,  they  are  of  high  quality;  a  small  number  of  deter¬ 
ministic  patterns  can  usually  catch  a  large  percentage  of  the  faults  of  the  circuit.  With 
pseudorandom  test,  the  patterns  may  not  be  of  as  high  quality,  so  a  greater  number  of 
patterns  may  be  required. 

Deterministic  and  pseudorandom  test  also  differ  in  the  ways  that  the  test  patterns  are 
typically  generated  during  the  course  of  the  test.  In  general,  it  takes  a  large  amount  of 
hardware  to  generate  a  deterministic  test;  for  this  reason,  the  test  pattern  generation  is 
usually  done  off-chip,  and  is  the  responsibility  of  automatic  test  equipment  (ATE).  In 
contrast,  pseudorandom  test  patterns  can  be  generated  very  simply  with  a  minimal 
amount  of  hardware.  This  makes  pseudorandom  test  amenable  to  built-in  self-test 
(BIST)  techniques  [AgKS93a]  [AgKS93b],  for  which  all  circuitry  needed  to  test  the 
circuit  is  placed  directly  on  the  chip.  A  number  of  different  BIST  methodologies  exist, 
and  will  be  described  later  in  this  chapter.  All  have  in  common  the  addition  of  test  cir¬ 
cuitry  to  the  chip  to  provide  test  patterns  to  the  circuitry  of  the  chip  and  to  analyze  the 
response  of  the  circuitry  of  the  chip  to  the  test  patterns.  BIST  techniques  eliminate  the 
need  for  expensive  ATE;  using  BIST,  the  only  input  that  a  chip  requires  from  its  envi¬ 
ronment  in  the  course  of  the  test  is  an  indication  of  when  the  test  should  begin.  Simi¬ 
larly,  the  chip  gives  only  limited  output  reporting  the  results  of  the  test. 

Because  all  BIST  circuitry  is  on-chip,  test  can  be  done  at  normal  circuit  speed;  in  con¬ 
trast,  with  deterministic  tests  using  ATE,  test  must  usually  be  done  at  a  slower-than- 
normal  rate.  This  means  that  BIST  techniques  have  the  capability  to  detect  timing- 
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related  problems  that  an  off-chip  deterministic  test  may  miss.  Another  advantage  of 
having  all  test  circuitry  directly  on-chip  is  that  the  chip  need  not  be  placed  in  a  special 
environment  (i.e.,  into  the  socket  of  an  ATE)  during  test;  this  means  that  BIST  tech¬ 
niques  are  useful  not  only  for  chip  level  test,  but  also  for  board  level  test.  Since  the 
interaction  of  a  chip  with  its  environment  during  test  is  minimal,  it  is  relatively  simple 
to  place  a  test  controller  on  a  board  that  asks  each  chip  on  the  board  to  test  itself. 

One  disadvantage  of  BIST  techniques  is  the  area  required  on  chip  for  the  test  circuitry. 
The  addition  of  test  circuitry  can  also  cause  some  degradation  in  system  performance, 
even  when  the  chip  is  operating  in  normal  (non-test)  mode.  Thus,  a  key  problem  that 
must  be  solved  in  order  to  make  BIST  practical  is  how  to  add  BIST  features  to  a  circuit 
while  having  a  minimal  impact  on  overall  area  and  performance. 


2.5  Scan  Methodologies  for  Deterministic  Test 

The  problem  of  test  pattern  generation  has  been  fully  solved  only  for  combinational 
logic.  Because  generating  deterministic  test  patterns  for  sequential  circuits  can  be  pro¬ 
hibitively  expensive,  especially  if  those  circuits  have  a  high  degree  of  feedback,  scan 
methodologies  are  commonly  used  to  simplify  the  problem  [AbBF90]  [BaMS87] 
[John89].  Essentially,  scan  serves  to  make  the  internal  flip-flops  of  the  circuit  more 
directly  controllable  and  observable;  this  is  accomplished  by  adding  an  operating 
mode  to  the  circuit  in  which  the  flip-flops  form  a  simple  shift  register  or  scan  chain. 
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For  thejull  scan  methodology,  all  internal  flip-flops  are  made  accessible  by  placing 
them  in  the  scan  chain.  Since  all  flip-flops  are  now  accessible,  the  problem  of  generat¬ 
ing  test  patterns  for  the  circuit  degenerates  to  that  of  generating  test  patterns  for  the 
combinational  logic  of  the  circuit.  A  specific  test  pattern  is  applied  to  the  combina¬ 
tional  logic  by  placing  the  scan  chain  in  shift  mode  and  shifting  in  the  test  pattern. 
Then,  the  scan  chain  is  placed  in  normal  mode,  so  that  the  circuit  operates  according  to 
its  designed  function;  when  the  circuit  is  clocked  in  normal  mode,  the  test  pattern  is 
applied  to  the  logic.  At  this  point,  the  response  of  the  combinational  logic  to  the  test 
pattern  is  clocked  into  the  scan  chain;  by  putting  the  scan  chain  back  into  shift  mode, 
the  response  can  be  shifted  out  and  observed  directly. 

The  partial  scan  methodology  improves  on  the  high  hardware  overhead  of  full  scan  by 
placing  only  a  subset  of  the  flip-flops  of  the  circuit  in  the  scan  chain.  A  variety  of  tech¬ 
niques  exist  for  deciding  which  of  the  flip-flops  should  be  placed  in  the  scan  chain; 
some  will  be  described  in  Chapter  Three,  which  contains  a  survey  of  related  research. 


2.6  BIST  Methodologies  for  Pseudorandom  Test 

All  BIST  methodologies  are  based  on  the  use  of  on-chip  test  circuitry  to  provide  test 
patterns  to  the  circuitry  of  the  chip  and  to  analyze  the  response  of  the  circuitry  of  the 
chip  to  the  test  patterns;  the  methodologies  vary  only  in  how  the  test  pattern  generators 
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and  response  analyzers  are  implemented.  We  now  describe  several  pseudorandom 
BIST  methodologies  of  special  interest;  more  complete  treatments  can  be  found  in 
[AbBF90]  and  [BaMS87]. 

2.6.1  Conventional  BIST 

The  conventional  BIST  methodology  uses  linear  feedback  shift  registers,  also  called 
test  pattern  generation  registers  (TPGRs),  to  provide  the  test  patterns.  Figure  2-3 
shows  a  typical  test  pattern  generation  register.  The  register  itself  is  a  deterministic 


Figure  2-3.  A  test  pattern  generation  register  (TPGR). 


system;  however,  as  it  is  clocked,  its  states  over  time  satisfy  some  of  the  same  proper¬ 
ties  as  uniformly  distributed  random  numbers,  and  so  the  TPGR  is  said  to  provide 
pseudorandom  patterns.  Conventional  BIST  also  makes  use  of  multiple  input  shift  reg¬ 
isters  (MISRs)  as  shown  in  Figure  2-4.  Rather  than  checking  the  response  of  the  cir¬ 
cuit  to  each  test  pattern  individually,  conventional  BIST  uses  an  MISR  to  compress  the 
responses  over  time  into  a  single  signature  that  can  be  checked  at  the  end  of  the  test 
session,  after  all  test  patterns  have  been  applied.  Since  the  compression  of  circuit 
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Figure  2-4.  A  multiple  input  shift  register  (MISR). 

responses  into  a  signature  involves  some  loss  of  information,  getting  the  expected 
fault-free  signature  is  not  a  guarantee  that  no  fault  exists  in  the  circuit;  it  is  possible  for 
a  fault  effect  to  be  masked.  The  probability  that  a  fault  effect  is  masked  is  referred  to  as 
the  aliasing  probability. 

When  BIST  is  added  to  a  circuit,  the  circuit  is  usually  partitioned  into  manageable 
pieces  called  kernels,  and  BIST  registers  are  placed  at  die  inputs  and  outputs  of  each 
kernel.  For  example,  in  a  register  transfer  level  (RTL)  datapath,  the  main  focus  may  be 
on  testing  the  arithmetic  logic  units  (ALUs),  and  so  each  ALU  may  be  in  a  kernel  by 
itself,  with  TPGRs  at  the  inputs  to  the  ALU,  and  an  MISR  at  the  output  of  the  ALU.  If 
some  signal  is  at  the  input  of  one  kernel  Ki  and  the  output  of  another  kernel  K2,  a  third 
type  of  test  register,  the  built-in  logic  block  observation  (BELBO)  register,  may  be  used 
[KoMZ79].  This  is  a  “double  duty”  register  that  can  act  as  either  a  test  pattern  genera¬ 
tion  register  or  a  multiple  input  shift  register,  depending  on  its  mode  of  operation  at  a 
given  time.  Since  the  BILBO  can  not  generate  test  patterns  and  compress  circuit 
responses  simultaneously,  kernels  Kj  and  K2  must  be  tested  in  separate  test  sessions. 
This  is  shown  in  Figure  2-5;  during  the  test  session  for  Kj,  the  BILBO  acts  as  an 
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MISR,  compacting  the  responses  of  kernel  Ki,  and  during  the  test  session  for  K2,  the 
BILBO  acts  as  an  TPGR,  providing  test  patterns  for  kernel  K2. 
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in  test  session  2 

(a)  structure  of  circuit.  (b)  operation  of  BILBO  during  the  test. 

Figure  2-5.  The  use  of  a  BILBO  to  test  two  adjacent  kernels. 


Once  a  circuit  has  been  partitioned  appropriately  into  kernels,  and  test  registers  have 
been  added,  test  scheduling  must  be  done;  this  step  looks  at  the  requirements  on  the 
BILB  Os  of  the  circuit,  and  determines  which  kernels  may  be  tested  in  parallel  without 
conflict.  Based  on  this,  the  kernels  are  divided  up  into  test  sessions.  A  single  test  ses¬ 
sion  sets  the  BILBOs  of  the  circuit  to  specific  operational  modes,  and  then  uses  those 
modes  to  test  the  kernels  assigned  to  it.  Note  that,  in  general,  conventional  BIST 
requires  a  complex  test  controller;  the  test  controller  must  be  able  to  move  from  test 
session  to  test  session  at  the  appropriate  times  by  sending  the  proper  operational  mode 


controls  to  the  BILBOs. 
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To  avoid  the  complicated  problem  of  test  scheduling,  another  kind  of  test  register,  the 
concurrent  built-in  logic  block  observation  register  ( cBILBO),  can  be  used  [WaMc86]. 
The  cBILBO  register  is  like  a  BILBO,  except  that  it  can  generate  test  patterns  and 
compact  test  responses  simultaneously,  thus,  it  eliminates  the  need  to  test  kernels  sep¬ 
arately.  Despite  this  advantage,  the  cBILBO  is  seldom  used  because  of  its  large  area;  it 
is  essentially  as  large  as  a  TPGR  and  an  MISR  combined. 

2.6.2  MISR-based  BIST 

MISR-based  BIST  attempts  both  to  reduce  the  area  overhead  required  for  conven¬ 
tional  BIST  and  to  eliminate  the  need  to  test  kernels  separately.  The  basic  scheme  is 
shown  in  Figure  2-6  for  two  kernels.  A  single  test  session  is  used  to  test  both  Ki  and 
K2,  during  that  test  session,  the  MISR  between  the  kernels  compacts  the  responses  of 
Ki,  and  the  partial  signatures  created  in  the  MISR  are  used  as  test  patterns  for  K2.  The 
main  disadvantage  of  MISR-based  BIST  is  that  the  partial  signatures  of  Ki  may  not  be 
good  quality  test  patterns  for  K2. 

2.6.3  Circular  BIST  and  circular  self-test  path 

Circular  built-in  self-test  and  the  circular  self-test  path  technique  have  recently  been 
proposed  as  lower  cost  alternatives  to  conventional  BIST  [KrPi87]  [PiKK92]  [Stro88] 
[POLB88].  Both  methodologies  make  use  of  a  special  test  register,  shown  in  Figure 
2-7,  that  can  simultaneously  both  generate  test  patterns  and  compact  responses.  There 
is  no  feedback  within  the  circular  BIST  test  registers  themselves;  instead,  the  feedback 
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Figure  2-6.  An  MISR-based  BIST  scheme  for  two  kernels. 


Figure  2-7.  A  circular  BIST  test  register. 


necessary  to  pseudorandom  BIST  is  created  in  the  way  that  the  test  registers  are  linked 
together.  In  the  circular  self-test  path  technique,  the  feedback  is  formed  by  linking  the 
test  registers  together  to  form  one  long  circular  self-test  path.  Figure  2-8  shows  an 
example  RTL  structure  with  a  circular  self-test  path  added.  The  circular  self-test  path 
is  shown  with  a  broken  line;  for  this  example,  only  the  primary  input  and  primary  out¬ 
put  registers  are  included  in  the  path,  although  in  general  any  of  the  registers  may  be 
included.  For  this  methodology,  the  signature  is  obtained  either  by  watching  the 
stream  of  bits  flowing  through  a  specified  point  on  the  path  for  a  specified  number  of 
clocks  before  the  end  of  the  test  session  [Stro88]  [PiKK92],  or  by  looking  at  what 
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Figure  2-8.  An  example  RTL  structure  with  a  circular  self-test  path. 


remains  in  the  test  registers  at  the  end  of  the  test  session.  Note  that  there  is  no  need  to 
scan  out  signatures  of  internal  blocks,  since  all  signatures  will  pass  through  a  primary 
output  register. 

For  the  circular  BIST  methodology,  the  test  registers  are  linked  together  into  a  long 
chain;  however,  the  two  ends  of  the  chain  are  not  linked  to  make  a  circle  as  in  the  cir¬ 
cular  self-test  path  technique.  Instead,  the  input  end  of  the  chain  is  driven  by  a  TPGR, 
and  the  output  end  of  the  chain  drives  an  MISR.  Figure  2-9  shows  the  same  circuit  as 
Figure  2-8,  using  the  circular  BIST  methodology  instead  of  a  circular  self-test  path. 
For  this  methodology,  the  signature  is  obtained  from  the  MISR  at  the  end  of  the  test 
session. 


Both  the  circular  self- test  path  and  circular  BIST  methodologies  offer  several  advan¬ 
tages  over  conventional  BIST,  particularly  in  regards  to  hardware  overhead.  Test  con¬ 
trol  is  much  simpler  for  these  methodologies  than  for  conventional  BIST.  First,  a 
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Rgure  2-9.  An  example  RTL  structure  using  the  circular  BIST  methodology. 

circular  BIST  register  has  fewer  modes  of  operation  than  a  BILBO  register,  so  the  test 
controller  generates  fewer  control  bits.  Secondly,  the  entire  circuit  can  be  tested  in  one 
session,  so  the  test  controller  is  not  responsible  for  managing  flow  of  control  from  one 
test  session  to  the  next,  and  there  is  no  complicated  test  scheduling  problem  to  solve. 

The  main  disadvantages  of  the  circular  methodologies  in  comparison  to  conventional 
BIST  are  questions  about  the  quality  of  the  test  patterns,  which  may  no  longer  be  truly 
pseudorandom,  and  questions  about  the  probability  of  error  masking  or  aliasing  during 
test  response  compaction;  however,  empirical  study  has  shown  the  circular  self-test 
path  technique  to  be  effective  at  both  generating  test  patterns  and  compacting 
responses  [PiKK92]. 
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2. 7  Fitting  DFT  into  the  Design  Flow 

Figure  2- 10  shows  a  typical  ASIC  design  flow  from  concept  to  silicon,  with  the  vari¬ 
ous  levels  of  design  abstraction  at  which  the  design  is  described  along  the  way,  and  the 
tools  that  move  the  design  from  one  level  of  design  abstraction  to  another.  Consider- 
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Figure  2-10.  The  ASIC  design  flow. 
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ation  of  testability  can  begin  at  any  time  in  the  course  of  the  design  flow;  however, 
there  are  many  advantages  to  considering  testability  early  in  the  design  flow.  The  most 
important  advantage  is  that  the  earlier  we  consider  testability,  the  more  flexibility  we 
have  in  the  design.  Design  decisions  have  a  profound  effect  on  the  testability  of  a  fin¬ 
ished  circuit;  if  we  wait  until  a  design’s  physical  implementation  is  nearly  complete  to 
start  thinking  about  how  to  test  it,  we  may  have  already  made  design  choices  that  are 
catastrophic  from  the  point  of  view  of  testability.  Furthermore,  the  higher  the  level  of 
design  abstraction,  the  fewer  the  number  of  components  that  must  be  considered  when 
analyzing  the  circuit  structure.  For  example,  a  circuit  with  thousands  of  gates  may 
have  only  a  small  number  of  register  transfer  level  components;  this  smaller  number  of 
components  make  the  analysis  problem  much  more  computationally  tractable.  Addi¬ 
tionally,  at  higher  levels  of  abstraction  we  may  have  more  information  about  the  cir¬ 
cuit;  for  example,  at  the  algorithmic  level  we  know  the  function  performed  by  the 
circuit.  This  can  guide  us  greatly  in  designing  a  test,  making  the  job  easier  than  if  all 
we  have  is  a  netlist  of  gates  with  no  real  notion  of  behavior  or  how  the  circuit  is  meant 
to  be  used. 


2.8  Testability  Metrics 

In  testing  a  circuit,  the  bottom  line  is  fault  coverage;  a  circuit  is  easily  testable  if  a  test 
can  be  devised  that  detects  a  high  percentage  of  the  potential  faults  in  the  circuit  in  a 
reasonable  amount  of  time.  However,  since  fault  simulation  is  expensive,  and  not 


26 


available  at  levels  of  design  abstraction  above  the  gate  level  unless  synthesis  is  per¬ 
formed  first,  it  is  not  feasible  to  use  fault  simulation  to  guide  design-for-testability. 
Instead,  testability  metrics  are  used  to  capture  the  essential  aspects  of  a  high  quality 
test  indirectly.  The  two  key  components  of  a  high  quality  test  are  the  ability  to  control 
an  internal  signal  in  the  circuit  to  some  value  that  is  necessary  to  sensitize  a  fault,  and 
the  ability  to  observe  an  internal  signal  to  see  whether  some  fault  effect  has  caused  an 
error  in  that  signal.  Testability  metrics  can  be  divided  into  two  categories:  those  that 
measure  controllability,  and  those  that  measure  observability. 

Each  signal  embedded  within  a  circuit  can  be  thought  of  as  generating  test  patterns  for 
the  part  of  the  circuit  driven  by  the  signal.  Controllability  measures  whether  those  test 
patterns  (for  that  part  of  the  circuit)  are  of  good  quality.  What  metric  should  be  used 
for  controllability  depends  largely  on  the  test  scheme  used.  For  automatic  test  pattern 
generation  (ATPG)-based  test,  the  issue  is  one  of  being  able  to  control  the  signal  in 
question  to  a  value  necessary  to  sensitize  a  fault;  therefore,  a  suitable  metric  should 
measure  how  difficult  it  is  to  set  the  signal  to  the  necessary  value.  We  will  see  several 
ATPG-based  testability  metrics  in  Chapter  Three,  when  we  survey  related  research  in 
design-for-testability.  For  built-in  self-test  (BIST)-based  test,  the  issue  of  controllabil¬ 
ity  is  slightly  different.  We  will  describe  the  BIST  controllability  metrics  that  we  use 
in  this  work  in  Chapter  Four. 

Observability  of  a  signal  measures  the  degree  to  which  we  can  tell  what  is  happening 
at  that  signal,  given  that  we  cannot  look  at  the  signal  directly,  but  instead  can  look  only 
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at  the  primary  ouq)uts  of  the  circuit.  In  ATPG-based  work,  metrics  for  observability 
may  measure  the  difficulty  in  sensitizing  a  path  for  the  fault  effect  from  the  signal  to  a 
primary  output.  We  will  present  our  own  observability  metric,  suitable  for  BIST,  in 
Chapter  Four. 


2.9  Summary 

This  chapter  has  filled  in  background  information  necessary  to  this  research.  A  typical 
ASIC  design  flow  was  presented  in  order  to  clarify  the  levels  of  abstraction  at  which  a 
VLSI  design  can  be  described.  This  concept  is  central  to  our  work,  which  develops 
BIST  insertion  methodologies  applicable  at  various  design  levels  throughout  the  ASIC 
design  flow.  The  two  main  approaches  to  test,  deterministic  and  pseudorandom,  were 
presented  so  that  the  related  research  of  Chapter  Three,  which  uses  both  approaches, 
can  be  better  appreciated.  A  number  of  BIST  schemes  were  presented  preliminary  to 
their  use  in  later  chapters.  Finally,  the  concept  of  testability  metrics,  which  is  central  to 
much  of  the  related  research  in  Chapter  Three  and  is  used  throughout  our  own  work,  is 


described. 


Chapter  3 


Survey  of  Related  Research 


This  chapter  discusses  previous  work  in  design-for-testability,  which  encompasses 
techniques  to  design  digital  circuits  such  that  they  are  easy  to  test  once  synthesized. 
We  discuss  approaches  based  on  both  automatic  test  pattern  generation  (ATPG)  and 
built-in  self-test  (BIST),  because  although  the  two  schemes  are  different,  they  share 
many  of  the  same  objectives.  We  categorize  the  approaches  according  to  the  level  of 
design  abstraction  to  which  each  approach  is  applied.  We  further  divide  the 
approaches  into  testability  insertion  approaches  and  synthesis-for-testability 
approaches. 

Testability  insertion  approaches  take  a  design  at  a  given  level  of  abstraction  and  mod¬ 
ify  it,  resulting  in  a  more  testable  design  at  the  same  level  of  abstraction.  In  contrast, 
synthesis-for-testability  approaches  modify  the  synthesis  process  itself,  so  that  the  cre¬ 
ated  circuits  are  more  testable.  For  example,  it  has  been  shown  that  it  is  difficult  to 
generate  deterministic  tests  for  sequential  circuits  with  feedback  loops  and  large 
sequential  depth.  Thus,  many  ATPG-based  design-for-testability  approaches  focus  on 
the  elimination  of  feedback  and  the  minimization  of  sequential  depth.  For  testability 
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insertion  approaches,  elimination  of  feedback  may  entail  the  addition  of  test  circuitry 
that  breaks  any  existing  feedback  during  the  test;  for  synthesis-for-testability 
approaches,  the  synthesis  tool  may  be  modified  so  that  it  will  not  create  large  feedback 
loops. 

Design-for- testability  approaches  have  another  natural  division  based  on  the  method 
used.  Structure-based  techniques  place  explicit  restrictions  on  the  structure  of  the  cir¬ 
cuit  in  an  attempt  to  improve  testability;  for  example,  they  may  specify  that  a  circuit 
contain  no  large  feedback  loops.  Metrics-based  techniques  use  testability  metrics  that 
implicitly  capture  information  about  testability  problems  like  feedback  loops  and  large 
sequential  depths.  Metrics-based  testability  insertion  techniques  are  typically  iterative 
in  nature.  First,  testability  metrics  are  computed  for  points  throughout  a  circuit.  Next, 
one  point  is  selected  for  testability  improvement,  based  on  the  metrics  values,  and 
some  modification  is  made  to  the  circuit  to  improve  the  testability  at  that  point.  The 
metrics  for  the  circuit  are  recomputed,  and  the  process  is  iterated  until  all  points  of  the 
circuit  have  acceptable  testability  properties. 
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ATPG-based 

BIST-based 

logic  level 
testability 
insertion 

♦  Princeton:  [BhJh93] 

♦  Iowa:  [PoRe93] 

♦  AT&TERC:  [POLB88] 

[LiZB93] 

♦  Karlsruhe:  [StWu94] 

♦  this  work:  Chapter  Seven 

register 
transfer  level 
testability 
insertion 

♦  IMEC:  [SCDM93] 

♦  Linkoping:  [KuPe89] 

[GuKP94] 

♦  Illinois:  [ChWS91a] 

[ChWS91b] 

[ChSa93] 

[ChLP92] 

♦  USC:  [LiNB93] 

♦  UCSD:  [OrHa93] 

1 

♦  this  work:  [CaPa95] 

Chapter  Eight 

state  table 
testability 
insertion 

♦  Rutgers:  [ChKA92] 

[KaCA93] 

[KaCA95] 

♦  Iowa:  [PoRe93] 

high  level 
synthesis-for¬ 
tes  tability 

♦  Princeton:  [LWJA92] 

[LeJW93a] 

[LeJW93b] 

[BhJh94] 

♦  Montreal:  [JaKa93] 

♦  Wisconsin:  [MuJS94a] 

[MuJS94b] 

♦  Linkoping:  [Peng95] 

♦  Cantabria:  [FeSV94] 

♦  CWRU:  [PaCH91] 

[ChPa91]  I 

[HaPa93] 

♦  UCSD:  [HaOr94a] 

[HaOr94b] 

[VaOr95] 

algorithmic 

level 

testability 

insertion 

♦  NEC  USA:  [DePo94] 

♦  niinois:  [ChKS94] 

♦  UT  Austin:  [ViAA92] 

[VTAA93] 

[ThVA94] 

this  work:  [PaCa95] 

Chapter  Nine 

j 

1 

Table  3-1.  Related  research  in  design-for-testability. 
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3. 1  Logic  Level  Testability  Insertion 

We  now  summarize  several  projects  for  testability  insertion  in  logic  level  circuits. 

3.1.1  Princeton  University 

Bhatia  and  Jha  [BhJh93]  decide  where  to  place  scan  elements  for  ATPG-based  circuits 
by  using  controllability  and  observability  metrics  that  implicitly  capture  the  presence 
of  feedback  loops  and  the  sequential  depth.  This  work  defines  controllability  and 
observability  metrics  for  each  latch,  primary  input,  and  primary  output  in  the  circuit. 
Ihimary  inputs  and  scan  chain  elements  have  high  controllability;  for  a  non-scan  latch 
L,  controllability  is  defined  to  be  the  average  of  the  controllabilities  of  all  latches  and 
primary  inputs  driving  the  latch  L  through  combinational  logic,  minus  a  factor  to  take 
into  account  the  increased  difficulty  in  controlling  L  due  to  its  being  a  larger  sequential 
depth  from  the  primary  inputs.  Primary  outputs  and  scan  chain  elements  have  high 
observability;  for  a  non-scan  latch  L,  observability  is  defined  as  the  average  of  the 
observabilities  of  all  latches  and  primary  outputs  that  latch  L  drives  though  combina¬ 
tional  logic,  minus  a  sequential  depth  factor.  An  iterative  approach  chooses  latches  one 
at  a  time  for  inclusion  in  the  scan  chain,  using  the  sum  of  controllability  and  observ¬ 
ability  as  an  overall  testability  metric. 
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3.1.2  University  of  Iowa 

Work  in  [PoRe93]  takes  a  significantly  different  view  of  logic  level  design-for-test- 
ability.  This  work  views  a  logic  level  circuit  as  a  state  machine,  and  concentrates  on 
modifying  that  state  machine  to  make  the  state  table  strongly  connected.  A  state  table 
is  strongly  connected  if  it  is  possible  to  traverse  from  any  state  to  any  other  state.  The 
state  table  is  modified  by  adding  state  transitions  to  it;  this  is  done  by  placing  some  of 
the  state  elements  in  a  partial  scan  chain.  The  authors  describe  how  strong  connectivity 
is  essential  in  detection  of  a  certain  class  of  faults. 

3.1.3  AT&T  Engineering  Research  Center 

The  work  of  [POLB88]  focuses  on  circular  BIST  insertion,  using  a  number  of  testabil¬ 
ity  metrics  to  make  decisions  about  which  flip-flops  should  be  included  in  the  circular 
BIST  chain.  For  each  flip-flop,  four  different  testability  metrics  are  computed.  The 
first  two  are  based  closely  on  the  structure  of  the  cone  of  combinational  logic  driving 
the  flip-flop;  the  first  counts  how  many  flip-flops  drive  that  cone  of  logic,  and  the  sec¬ 
ond  measures  the  degree  of  reconvergent  fanout  within  the  cone.  The  third  and  fourth 
metrics  measure  the  degree  of  difficulty  in  controlling  and  observing  the  flip-flop,  and 
are  patterned  after  the  SCOAP  metrics  of  [Gold79].  An  iterative  approach  chooses 
flip-flops  one  at  a  time  for  inclusion  in  the  circular  BIST  chain,  using  a  weighted  aver¬ 
age  of  the  four  metrics  as  an  overall  testability  metric. 
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Work  by  another  group  at  the  AT&T  Engineering  Research  Center  looks  at  a  combina¬ 
tion  of  partial  scan  and  BIST  [LiZB93].  Some  of  the  flip-flops  are  selected  for  inclu¬ 
sion  in  a  scan  chain;  the  selection  is  structure-based,  and  concentrates  on  breaking  all 
cycles  and  some  self-loops.  Rather  than  deriving  deterministic  test  patterns  to  shift 
into  the  scan  chain,  as  is  usual  for  partial  scan,  the  method  drives  the  scan  chain  with 
pseudorandom  patterns  from  a  TPGR.  The  output  of  the  scan  chain  is  observed 
through  an  MISR.  Additional  points  may  be  selected  for  BIST-type  test  point  insertion 
based  on  probabilistic  controllability  and  observabihty  metrics.  If  some  signal  has  low 
controllability,  circuitry  is  added  to  bring  a  new  pseudorandom  bit  to  that  point  during 
test.  If  some  signal  has  low  observability,  an  additional  primary  output  is  added  at  that 
point. 


3.1.4  Universities  of  Karlsruhe  and  Siegen 

Work  described  in  [StWu94]  describes  a  method  for  BIST  insertion  that  spans  the 
logic  and  register  transfer  levels.  The  method  begins  by  deciding  which  flip-flops  in  a 
logic  level  circuit  need  to  be  test  flip-flops;  for  each  test  flip-flop,  a  mode  vector  is 
formed  showing  whether  the  test  flip-flop  is  used  to  generate  test  patterns  or  collect 
test  responses  during  each  of  the  test  sessions.  The  method  then  groups  the  test  flip- 
flops  together  to  form  test  registers,  using  the  mode  vectors  to  group  test  flip-flops  that 
perform  the  same  function  at  the  same  time  together  into  a  single  test  register.  In  this 
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way,  BIST  insertion  can  be  done  with  less  overhead  than  is  necessary  for  a  register 
transfer  level  approach  that  makes  decisions  in  terms  of  whole  registers.  This  method 
is  most  applicable  to  control-dominated  structures  that  contain  many  self-loops. 


3.2  Register  Transfer  Level  Testability  Insertion 

At  the  register  transfer  level,  testability  insertion  decisions  are  made  in  terms  of  whole 
registers  rather  than  individual  flip-flops  or  latches.  The  major  advantage  in  doing 
insertion  at  the  register  transfer  level  is  a  reduction  in  computational  complexity;  since 
the  elements  in  an  RTL  circuit  analysis  are  arithmetic  logic  units,  registers,  and  multi¬ 
plexers  rather  than  flip-flops  or  latches  and  gates,  there  is  an  order  of  magnitude  fewer 
elements  to  consider  at  the  RTL  than  at  the  gate  level. 

3.2.1  IMEC  (Belgium) 

The  technique  of  [SCDM93]  uses  partial  scan  to  break  all  die  feedback  loops  in  a  reg¬ 
ister  transfer  level  circuit.  Based  on  structural  analysis  of  the  circuit,  the  technique 
uses  a  depth-first  search  to  enumerate  all  feedback  loops  and  their  possible  cuts,  and 
then  uses  a  branch-and-bound  algorithm  with  some  pruning  to  find  an  optimal  solution 
in  terms  of  the  minimal  number  of  latches  that  must  be  included  in  the  scan  chain  in 
order  to  break  all  the  loops.  The  approach  does  not  rely  solely  on  turning  existing 
latches  into  scan  latches  to  break  feedback;  it  also  allows  the  addition  of  new  scan 


latches  that  are  transparent  in  normal  mode. 
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3.2.2  Linkoping  University 

Like  the  work  in  [SCDM93],  the  work  described  in  [KuPe89]  and  [GuKP94]  uses  par¬ 
tial  scan  to  break  the  feedback  loops  in  register  transfer  level  circuits;  however,  it  uses 
testability  metrics  to  decide  which  of  the  registers  involved  in  the  feedback  should  be 
made  scan  registers,  and  where  to  place  T-cells,  which  are  essentially  scan  elements 
that  are  transparent  in  normal  mode.  The  work  defines  testability  as  a  function  of  four 
metrics:  combinational  controllability,  combinational  observability,  sequential  control¬ 
lability,  and  sequential  observability.  Primary  inputs  have  high  combinational  control¬ 
lability,  and  the  output  of  an  arithmetic  logic  unit  (ALU)  has  a  combinational 
controllability  that  is  the  average  of  the  controllabilities  of  the  inputs  to  the  ALU, 
times  a  factor  that  represents  the  decrease  in  controllability  due  to  the  functionality  of 
the  ALU.  Primary  outputs  have  high  combinational  observability.  The  input  of  an 
ALU  has  combinational  observability  that  is  the  product  of  three  factors:  the  average 
of  the  observabilities  of  the  outputs  of  the  ALU;  a  factor  that  represents  the  decrease  in 
observability  due  to  the  functionality  of  the  ALU;  and  the  average  of  the  controllabili¬ 
ties  of  the  other  inputs  to  flie  ALU.  The  sequential  metrics  represent  the  number  of 
clock  cycles  necessary  to  control  or  observe  an  element.  Overall  testability  is  defined 
as  a  function  of  the  four  metrics,  and  an  iterative  approach  is  used  for  the  actual  inser¬ 
tion.  First,  the  feedback  loops  are  broken  by  selecting  one  register  with  poor  testability 
in  each  loop  for  insertion.  Then,  metrics  are  recomputed,  and  any  other  registers  with 
poor  testability  are  fixed,  one  by  one,  re-computing  the  metrics  after  each  insertion. 
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Finally,  T-cell  insertion  is  used  to  improve  the  metrics  of  any  lines  with  poor  testabil¬ 
ity. 


3.2.3  University  of  Illinois 

Work  presented  in  [ChWS91a],  [ChWS91b],  and  [ChSa93]  selects  registers  in  register 
transfer  level  circuits  for  partial  scan,  with  the  goal  of  easing  automatic  test  pattern 
generation.  It  guides  the  selection  based  on  a  testability  analysis  of  a  data  flow  graph 
representing  the  circuit’s  behavior;  based  on  this  behavioral  analysis,  which  looks  for 
justification  and  propagation  paths,  registers  are  classified  in  terms  of  whether  or  not 
they  are  completely  controllable  and  observable.  When  a  node  is  not  completely  con¬ 
trollable,  meaning  that  there  is  some  value  that  can  not  be  Justified  at  that  node,  partial 
scan  insertion  is  done  to  fix  the  deficiency. 

Another  project  at  the  University  of  Illinois  also  does  partial  scan  insertion  in  register 
transfer  level  circuits  [ChLP92]  based  on  behavioral  analysis;  however,  this  work  uses 
a  different  set  of  testability  metrics.  Here,  controllability  of  a  node  is  measured  by  esti¬ 
mating  the  minimum  and  maximum  number  of  instruction  cycles  required  to  set  that 
node  to  a  given  value.  Observability  is  measured  similarly,  with  estimates  of  the  mini¬ 
mum  and  maximum  number  of  instruction  cycles  needed  to  observe  a  given  value  at 
the  node.  The  first  step  of  insertion  is  to  choose  scan  registers  to  break  all  reconvergent 
fanout;  this  is  done  because  the  computation  method  for  the  metrics  does  not  properly 
handle  reconvergent  fanout.  Next,  an  iterative  procedure  is  used  to  select  additional 
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scan  registers,  based  on  the  testability  metrics  and  the  cost  associated  with  a  given 
selection. 

3.2.4  University  of  Southern  California 

Work  in  [LiNB93]  describes  a  method  for  using  BIST  to  test  each  arithmetic  logic  unit 
(ALU)  in  a  register  transfer  level  circuit  without  resorting  to  the  kind  of  full  BIST 
implementation  traditionally  used,  with  test  registers  directly  at  the  inputs  and  outputs 
of  each  ALU.  Instead,  an  analysis  looks  at  possible  embedded  test  environments  for 
each  ALU.  These  embedded  test  environments  are  based  on  the  identification  of  I- 
paths  within  the  circuit^;  these  are  paths  that  can  be  used  to  bring  a  test  pattern  from  a 
TPGR  or  BILBO  in  some  other  part  of  the  circuit  to  the  input  of  the  ALU  in  question, 
or  to  bring  a  test  response  from  the  output  of  the  ALU  in  question  to  an  MISR  or 
BILBO  in  some  other  part  of  the  circuit.  In  general,  there  are  many  possible  ways  to 
add  BIST  registers  to  a  circuit  so  that  a  suitable  test  environment  exists  for  each  ALU. 
A  branch-and-bound  technique  is  employed  to  prune  the  space  of  possible  designs, 
based  on  area  overhead  and  test  application  time  for  overall  designs,  which  is  deter¬ 
mined  by  doing  test  scheduling  and  includes  an  estimate  of  the  test  length  required  for 
individual  ALUs. 


1.  The  “I”  in  “I-path”  stands  for  “identity”,  and  refers  to  the  fact  that  a  signal  does  not 
change  in  value  as  it  is  propagated  along  the  path. 
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3.2.5  University  of  California  at  San  Diego 

Work  in  [OrHa93]  describes  a  method  for  BIST  insertion  in  register  transfer  level 
datapaths.  The  BIST  insertion  is  guided  by  metrics  designed  to  minimize  the  test 
application  time  for  the  resulting  circuit.  These  metrics  take  into  account  all  possible 
test  paths  that  can  be  used  for  testing  a  module.  The  metrics  also  measure  the  length  of 
a  test  session  for  a  module  when  a  given  test  path  is  used;  this  length  is  based  on  the 
randomness  and  transparency  properties  of  the  signals  at  the  inputs  and  outputs  of  the 
module  as  defined  in  [ChPa91]. 


3.3  State  Table  Testability  Insertion 

This  group  of  research  takes  a  different  view  of  register  transfer  level  design-f or- test¬ 
ability;  rather  than  working  on  datapaths,  which  are  interconnections  of  ALUs,  multi¬ 
plexers,  and  registers,  these  projects  view  the  circuit  as  a  state  table  or  finite  state 
machine.  This  is  a  common  way  to  view  the  controller  part  of  a  datapath  /  controller 
pair,  and  general  circuits  can  be  viewed  in  this  way  as  well.  The  projects  described 
here  make  modifications  in  the  state  table  description  of  the  circuit,  so  that  when  the 
circuit  is  synthesized  to  the  logic  level  it  can  be  easily  tested. 

3.3.1  Rutgers  University  et.  al. 

Collaborative  research  done  at  Rutgers  University,  NEC  USA’s  C&C  Research  Labo¬ 
ratories  and  AT&T  Bell  Laboratories  alters  the  state  table  description  of  a  circuit. 
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embedding  a  test  machine  within  the  design  machine  [ChKA92]  [KaCA93] 
[KaCA95].  The  idea  is  to  add  transitions  to  the  state  table  that  are  taken  only  when  the 
state  machine  is  in  test  mode.  The  resulting  state  machine,  when  synthesized,  has  two 
modes  of  operation,  the  original  design  mode  and  a  test  mode.  These  transitions  added 
to  the  machine  to  define  the  test  mode  are  chosen  specifically  to  make  testing  easier. 
Transitions  in  the  test  machine  ensure  that  for  any  state  of  the  circuit,  there  is  a  short 
predetermined  sequence  of  inputs  that  can  be  used  to  set  the  circuit  to  that  state,  and  a 
short  predetermined  sequence  of  inputs  that  can  be  used  to  propagate  the  state  to  the 
primary  outputs. 

3.3.2  University  of  Iowa 

In  an  extension  of  logic  level  design-for-testability  work  described  earlier  in  this  chap¬ 
ter,  work  in  [PoRe93]  modifies  the  state  table  description  of  a  circuit  by  adding  state 
transitions.  This  technique  focuses  on  making  the  state  table  strongly  connected,  for 
reasons  already  described.  In  the  logic  level  work,  new  state  transitions  were  effected 
using  partial  scan.  At  this  level,  the  circuit  has  not  yet  been  synthesized,  so  it  is  possi¬ 
ble  to  simply  add  the  new  state  transitions  directly  to  the  state  table. 
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3.4  High  Level  Synthesis-for-Te stability 

High  level  synthesis-for-testability  strives  to  modify  the  high  level  synthesis  process 
so  that  the  resulting  synthesized  circuits  are  easily  testable.  [AvMc94]  presents  an 
excellent  overview  of  university  research  efforts  in  this  area. 

3.4.1  Case  Western  Reserve  University 

[PaCH91]  and  [HaPa93]  describe  the  use  of  testability  constraints  to  guide  high  level 
synthesis  for  BIST-based  circuits.  Their  allocation  scheme  restricts  the  structure  of  the 
synthesized  circuit,  preventing  the  formation  of  certain  self-loops  that  can  degrade  the 
quality  of  BIST  tests,  hi  [ChPa91],  testability  metrics  are  used  to  guide  the  high  level 
synthesis  process;  these  metrics  are  used  to  decide  when  controllable  and  observable 
points  may  be  ranoved,  and  therefore  the  metrics  allow  the  system  to  trade-off  hard¬ 
ware  overhead  with  test  quality  and  test  application  time. 

3.4.2  Princeton  University 

[LWJA92],  [LeJW93a]  and  [LeJW93b]  describe  work  done  at  Princeton  University  in 
high  level  synthesis  of  circuits  that  are  easily  testable  using  ATPG.  Their  structure- 
based  method  for  register  allocation  avoids  the  creation  of  sequential  loops  during  reg¬ 
ister  binding;  when  loops  can  not  be  completely  avoided,  the  method  relies  on  partial 
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scan  to  break  any  remaining  feedback  during  test.  The  register  allocation  also  reduces 
sequential  depth  and  improves  the  controllability  and  observability  of  the  registers  by 
binding  at  least  one  primary  input  or  output  value  to  each  register  when  possible. 

Another  ATPG-based  project  at  Princeton,  described  in  [BhJh94],  also  concentrates  on 
modification  of  the  algorithm  that  does  allocation  during  high  level  synthesis.  Here, 
though,  the  modification  concentrates  on  maintaining  test  environments  for  each  of  the 
modules  in  the  circuit.  A  test  environment  for  a  module  is  a  set  of  justification  and 
propagation  paths  that  ensure  that  the  proper  test  patterns  can  be  justified  at  the  inputs 
of  the  module,  and  that  the  outputs  of  the  module  can  be  propagated  to  the  primary 
outputs.  By  ensuring  that  each  module  has  a  test  environment  during  each  iteration  of 
the  allocation  process,  the  method  ensures  that  the  fmal  synthesized  circuit  will  be 
testable  using  ATPG. 

3.4.3  Ecole  Polytechnique  de  Montreal 

[JaKa93]  describes  metrics  for  register  transfer  level  testability  analysis.  The  metrics 
are  designed  to  be  used  to  guide  high  level  synthesis  through  incorporation  with  the 
cost  function  used  by  most  high  level  synthesis  systems  to  choose  between  competing 
designs.  The  metrics,  which  are  based  on  previous  gate  level  work,  measure  the  diffi¬ 
culty  of  setting  a  signal  to  a  specified  value,  and  of  propagating  a  signal  error  to  a  pri¬ 
mary  output.  A  heuristic  for  computing  the  metrics  is  based  on  reduced  ordered  binary 


decision  diagrams. 


42 


3.4.4  University  of  Wisconsin 

Work  done  at  the  University  of  Wisconsin  [MuJS94a]  [MuJS94b]  modifies  the  bind¬ 
ing  algorithm  in  an  attempt  to  synthesize  a  circuit  that  is  more  testable  using  ATPG. 
Their  algorithm,  which  maps  the  binding  problem  onto  a  minimum-cost  network  flow 
problem,  tries  to  avoid  bindings  that  create  self-loops  and  cycles,  and  that  create  regis¬ 
ters  that  are  not  easily  controllable  and  observable.  It  does  this  by  associating  high 
costs  with  these  undesirable  bindings. 

3.4.5  Linkoping  University 

[Peng95]  proposes  the  use  of  controllability  and  observability  metrics  to  guide  high 
level  synthesis  of  ATPG-based  circuits.  The  method  works  by  modifying  the  alloca¬ 
tion  process  to  take  testability  into  account.  An  effort  is  made  to  merge  nodes  with 
good  controllability  and  bad  observability  to  nodes  with  good  observability  and  bad 
controllability;  the  premise  is  the  merged  node  will  retain  the  good  controllability  of 
the  first  node  and  the  good  observability  of  the  second  node. 

3.4.6  University  of  Cantabria 

[FeSV94]  describes  a  high  level  synthesis  system  for  ATPG-based  circuits.  The  sys¬ 
tem  is  guided  by  testability  metrics  found  using  sampling  while  random  input  vectors 
are  applied  to  the  input  algorithmic  description.  For  a  single  bit,  controllability  is  mea¬ 
sured  in  terms  of  the  probability  of  forcing  the  bit  to  a  particular  value,  and  observabil¬ 
ity  is  measured  in  terms  of  the  probability  of  seeing  a  change  in  the  bit  at  a  primary 
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output.  The  controllability  and  observability  of  a  signal  is  defined  to  be  the  minimum 
of  the  controllabilities  and  observabilities  of  the  individual  bits  that  make  up  the  sig¬ 
nal.  An  additional  sequential  depth  factor  is  used  to  account  for  difficulty  in  control¬ 
ling  and  observing  signals  that  are  far  from  the  primary  inputs  and  outputs.  Another 
metric,  the  loop-based  test  cost,  is  used  to  quantify  the  ill  effect  that  the  presence  of 
loops  can  have  on  the  number  of  scan  registers  that  will  be  needed  to  break  loops,  and 
the  high  test  pattern  generation  cost  and  low  fault  coverage  typical  of  circuits  with 
loops. 

3.4.7  University  of  California  at  San  Diego 

Work  in  [HaGr94a],  [HaOr94b]  and  [VaC)r95]  develops  a  high  level  synthesis  method 
based  on  testability  metrics  that  are  designed  to  be  helpful  in  minimizing  the  test  appli¬ 
cation  time  required  for  BIST.  The  metrics  are  based  on  the  probabilities  that  two 
given  modules  will  be  bound  together  by  the  synthesis.  Conflict  and  coverage  metrics 
are  indirectly  related  to  the  number  of  test  sessions  that  will  be  required  for  the  final 
synthesized  circuit;  conflict  measures  the  degree  of  conflict  for  hardware  resources 
that  will  occur  when  an  attempt  to  test  the  circuit  in  one  test  session  is  made,  and  cov¬ 
erage  measures  the  accessibility  of  individual  modules.  A  correlation  metric  measures 
degradation  in  test  pattern  quality  due  to  reconvergent  fanout  both  inherent  to  the  data 
flow  and  caused  by  the  binding.  The  authors  show  how  the  metrics  can  be  updated 
during  synthesis,  with  the  initial  probabilistic  analysis  becoming  more  and  more  deter¬ 
ministic  as  binding  decisions  are  made. 
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3.5  Algorithmic  Level  Testability  Insertion 

The  following  group  of  testability  insertion  approaches  work  by  modifying  the  algo¬ 
rithmic  level  design  description  prior  to  high  level  synthesis. 

3.5.1  University  of  Illinois 

Work  in  [ChKS94]  extends  previous  ATPG-based  work  at  the  register  transfer  level 
upwards  to  the  algorithmic  level.  The  same  behavioral  testability  metrics  used  in 
[ChWS91a],  [ChWS91b],  and  [ChSa93]  to  select  registers  for  partial  scan  are  used 
here  as  a  basis  for  modifying  a  data  flow  graph  through  the  insertion  of  test  statements. 
The  test  statements,  executed  when  the  circuit  is  placed  in  test  mode,  make  it  easier  to 
find  the  justification  and  propagation  paths  necessary  for  ATPG. 

3.5.2  NEC  USA  C&C  Research  Laboratories 

Work  in  [DePo94]  proposes  modifying  the  behavioral  specification  of  a  design,  in  the 
form  of  a  control-data  flow  graph,  prior  to  high  level  synthesis  so  that  the  synthesized 
circuit  can  be  easily  tested  using  ATPG  with  a  minimum  amount  of  hardware  over¬ 
head.  Deflection  operations,  which  serve  to  move  operations  of  the  data  flow  graph 
without  changing  the  overall  functionality  of  the  design,  are  added  with  an  eye 
towards  reducing  the  number  of  scan  flip-flops  that  will  eventually  be  needed  to  break 
all  loops  in  the  synthesized  datapath.  Their  behavioral  transformation  approach  is  spe¬ 
cifically  designed  to  be  used  with  their  own  high  level  synthesis  system,  BETS. 
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3.5.3  University  of  Texas  at  Austin 

Work  in  [ViAA92],  [VTAA93]  and  [ThVA94]  modifies  an  algorithmic  level  behav¬ 
ioral  description  prior  to  high  level  synthesis  in  an  attempt  to  create  a  circuit  that  will 
be  easily  testable  when  synthesized;  their  method  is  demonstrated  using  an  ATPG- 
based  test.  Their  method  looks  for  parts  of  the  input  behavior  that  are  incompletely 
specified;  for  example,  it  looks  for  control  modes  that  are  unused.  The  method  then 
takes  advantage  of  the  unspecified  parts  of  the  behavior  by  adding  functionality  that  is 
conducive  for  testability;  for  example,  an  unused  control  mode  might  be  used  to  access 
a  part  of  the  circuit  that  is  ordinarily  difficult  to  access. 


3.6  Summary 

This  chapter  has  presented  a  survey  of  related  work  recently  done  in  testability  inser¬ 
tion  and  synthesis-for-testability.  Our  own  testability  insertion  methodologies, 
described  in  Chapters  Seven,  Eight,  and  Nine  for  the  logic  level,  register  transfer  level, 
and  algorithmic  level,  respectively,  are  meant  to  add  to  this  existing  work  by  looking 
at  the  same  problems  with  a  different  emphasis. 


Chapter  4 


BIST  Testability  Metrics 


In  Chapter  Three,  we  saw  a  wide  variety  of  testability  metrics  used  in  related  research 
for  both  automatic  test  pattern  generation-  (ATPG-)  based  and  built-in  self-test- 
(BIST-)  based  design-for-testability.  In  this  chapter,  we  present  a  more  detailed  look  at 
the  testability  metrics  used  in  our  work.  As  is  the  case  with  most  testability  metrics, 
our  metrics  are  divided  into  two  categories:  those  designed  to  measure  controllability, 
and  those  designed  to  measure  observability. 


4.1  Metrics  for  Controllability 

The  basic  premise  behind  built-in  self-test  is  that  it  is  possible  to  design  circuits  such 
that  they  can  be  easily  tested  using  pseudorandom  test  patterns.  Therefore,  a  natural 
choice  for  measuring  the  controllability  of  a  signal  is  a  metric  that  quantifies  the  ran¬ 
domness  of  the  signal.  Entropy,  a  standard  notion  from  information  theory  [Papo84], 
quantifies  the  uncertainty  about  the  outcome  of  an  event,  and  is  based  on  the  probabil¬ 
ity  distribution  for  the  underlying  state  space.  For  example,  suppose  that  a  signal  X  is 
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one  bit  wide.  Let  px  denote  the  1 -probability  of  signal  X,  i.e.,  let  px  denote  the  proba¬ 
bility  that  signal  X  takes  bit  value  ‘1’.  Signal  X  takes  bit  value  ‘0’  with  probability 
1  -  .  If  =  1 ,  we  know  that  signal  X  always  takes  on  bit  value  ‘  1’ ;  similarly,  if 

p^  =  0 ,  we  know  that  signal  X  always  takes  on  bit  value  ‘0’ .  In  both  of  these  cases, 
we  are  certain  about  the  outcome,  and  the  entropy  of  the  signal  is  zero.  If  Px  =  0-9,  we 
are  quite  certain  that  the  signal  will  take  on  bit  value  ‘1’,  so  the  entropy,  while  not 
zero,  is  still  quite  low.  We  are  least  certain  about  the  outcome  when  px=  0.5;  in  this 
case,  the  entropy  is  one,  the  maximum  value  for  a  one-bit  signal.  For  one-bit  signals, 
the  full  formula  for  entropy  is: 


'x  =  Px'»8^+(‘-Pj.)log— ^ 


(EQ  4-1) 


For  signals  greater  than  one  bit  wide,  the  principle  of  entropy  is  similar.  For  any  signal 
X,  let  LXl  denote  the  width  of  the  signal  in  bits.  The  current  state  of  the  signal  is  the 
value  taken  by  the  signal,  and  so  at  a  given  time  signal  X  may  be  in  any  one  of  the  fol- 

\X\ 

lowing  states:  0,  1,  2, . . .,  2  -  1 .  Let  X’s  state  probability  distribution  be  denoted  by 

a  row  vector  px ,  where 


- 

Px  = 


^X.O 

Pr  {signal  X  is  in  state  0} 

Px,  1 

Pr  {  signal  X  is  in  state  1 } 

Px,2 

= 

Pr  (signal  X  is  in  state  2} 

^x,  2!^  -  1 

Pr  (signal  X  is  in  state  2^^^  -  1 } 

(EQ  4-2) 


48 


Here,  Prlf}  denotes  the  probability  of  the  event  E.  The  entropy  of  the  signal  X  is 
defined  as: 


2W-I 

«  =  0  ' 


(EQ  4-3) 


Here,  entropy  ranges  from  zero,  when  one  event  occurs  with  probability  one  and  the 
rest  occur  with  probability  zero,  to  IXI,  when  all  events  are  equally  likely  to  occur. 


Following  the  work  of  [ThAb89]  and  [ChPa9l],  we  normalize  entropy  for  use  as  a 
controllability  metric.  We  define  the  randomness  of  a  signal  X  as: 

actual  output  entropy 

MR(X)  =  - : - - — ; - — -  =  .  (EQ  4-4) 

maxunum  output  entropy  \X\ 

As  a  result,  randomness  ranges  from  zero  to  one  regardless  of  the  bit  width.  The  ran¬ 
domness  metric  can  be  thought  of  as  a  comparison  of  a  signal’s  effectiveness  at  gener¬ 
ating  test  patterns  with  the  effectiveness  of  a  “perfect”  test  pattern  generator  that 
generates  uniformly  distributed  random  patterns  and  therefore  has  the  maximum  pos¬ 
sible  entropy.  If  a  signal  has  randomness  zero,  the  signal  generates  a  single  test  pattern 
over  and  over  again;  if  a  signal  has  randomness  one,  it  generates  ail  possible  test  pat¬ 
terns  with  approximately  equal  frequency. 


Another  controllability  metric  that  has  been  used  extensively  by  other  researchers  for 
built-in  self-test  is  that  of  expected  state  coverage  [KrPi87]  [BKK92].  Expected  state 
coverage  for  a  signal  X  is  the  fraction  of  all  2*^  possible  states  for  that  signal  that  are 
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expected  to  be  generated  or  covered  during  a  testing  session  of  a  given  length.  The 
expected  state  coverage  for  signal  X  in  clocks  can  be  written  in  terms  of  X’s  state 
probability  distribution  as: 


ESC(X,iV)=-^  S  (EQ4-5) 

^  =  0 

as  first  derived  in  [DeVH86].  Expected  state  coverage  gives  a  different  perspective 
from  randomness  on  a  signal’s  effectiveness  in  generating  test  patterns.  While  ran¬ 
domness  compares  the  signal  to  a  perfect  test  pattern  source  over  the  long  run, 
expected  state  coverage  tells  us  how  close  the  signal  comes  to  generating  an  exhaus¬ 
tive  test  during  a  test  session  of  fixed  length.  Research  by  Majumdar  and  Sastry 
[SaMa91a]  has  explored  the  relationship  between  state  coverage  and  a  more  direct 
measure  of  test  quality,  XhQ  fault  coverage  obtained  for  a  combinational  logic  block. 

Both  randomness  and  expected  state  coverage  of  a  signal  are  based  on  the  state  proba¬ 
bility  distribution  of  that  signal.  We  now  use  the  circuit  of  Figure  4-1  to  make  a  point 
about  the  calculation  of  the  probability  distributions.  ^  We  restrict  ourselves  tempo¬ 
rarily  to  trees  to  simplify  the  explanation.  The  controllability  of  a  signal  depends  only 
on  the  circuit  that  is  driving  it;  this  is  the  part  of  the  circuit  between  the  primary  inputs 
and  the  signal.  This  means  that  if  we  are  considering  the  controllability  of  signal  Z  in 
Figure  4-1,  we  need  only  consider  subcircuits  A  and  B,  and  the  operation  immediately 


1.  The  symbol  @  denotes  a  generic  operation  or  arithmetic  logic  unit. 
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driving  Z.  We  may  remove  subcircuit  C  entirely,  because  it  has  no  effect  on  the  con¬ 
trollability  of  Z. 


PRIMARY  INPUTS 


I 

PRIMARY  OUTPUT 

Figure  4-1.  Circuit  used  to  illustrate  controllability  and  observability  concepts. 


Actually,  for  simple  trees,  we  can  write  the  state  probability  distribution  of  signal  Z, 
P2,  directly  in  terms  of  the  state  probability  distributions  of  signals  X  and  Y,  px  and 
Py,  and  the  operation  0  used  to  combine  X  and  Y  to  make  Z.  For  example,  suppose 
that  the  operation  is  a  modulo- 16  adder,  so  that  Z = X  +  F,  and  that  all  of  the  signals  are 
four  bits  wide.  We  can  find  the  probability  that  signal  has  value  fifteen  in  terms  of  all 
the  (X,  F)  pairs  that  sum  to  fifteen.  We  have: 
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15  “  ^  { signal  Z  has  value  15} 

=  Pr  (signal  X  has  value  15  and  signal  Y  has  value  0} 

+  Pr  {  signal  X  has  value  14  and  signal  Y  has  value  1 } 

+  Pr  {  signal  X  has  value  13  and  signal  Y  has  value  2} 

+  ... 

+  Pr  {  signal  X  has  value  0  and  signal  Y  has  value  15  }  . 

For  our  tree,  signals  X  and  Y  are  independent,  so  we  can  write  this  as: 

Pz,  15  “  Px,  15  '  Py,0'^  Px,  14  '  Py,1'^  Px,l3'  Py,2'^  Px,  0  ‘  Py,  15  ' 
Other  elements  of  pz  are  calculated  in  a  similar  way. 


It’s  easy  to  see  how  to  compute  the  state  probability  distributions  for  an  overall  tree  of 
operations,  using  the  idea  that  we  can  compute  a  signal  Zs  state  probability  distribu¬ 
tion  in  terms  of  the  state  probability  distributions  of  the  two  signals  X  and  Y  that  imme¬ 
diately  drive  Z.  We  start  at  the  primary  inputs,  which  have  known  state  probability 
distributions;  for  example,  if  a  primary  input  signal  S  is  driven  with  uniformly  distrib¬ 
uted  random  patterns,  it  has  a  state  probability  distribution 


Ps  = 


J_  J_ 
21^1  ■■■ 


To  compute  the  state  probability  distributions  of  the  signals  internal  to  the  tree,  we 
work  down  the  levels  of  the  tree  one  by  one  until  we  reach  the  primary  output. 


For  general  circuits,  the  state  probability  distribution  computation  becomes  more  com¬ 
plicated.  In  Chapter  Five,  we  will  see  a  Markov  model  for  this  computation  that  takes 
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reconvergent  fanout,  direct  feedback  loops,  and  indirect  feedback  loops  into  account. 
It  also  handles  the  presence  of  BIST  registers.  However,  even  in  this  general  case,  the 
state  probability  distribution  of  a  signal  depends  only  on  the  circuitry  driving  the  sig¬ 
nal. 


4.2  A  Metric  for  Observability 

For  our  BIST-based  work,  we  measure  the  observability  of  a  signal  with  a  new  metric 
that  we  cal  transparency.  Transparency  measures  the  probability  that  an  arbitrary 
change  in  the  signal’s  value  can  be  observed  at  the  primary  output.  Transparency 
ranges  from  a  value  of  0  for  signals  that  can  not  be  observed  even  indirectly,  to  a  value 
of  1  for  signals  that  can  be  directly  observed. 

Transparency  is  inherently  more  complicated  a  concept  than  randomness.  To  see  why, 
consider  what  controllability  and  observability  mean  in  the  circuit  of  Figure  4- 1 . 
Again,  we  begin  by  restricting  ourselves  to  trees  of  operations.  As  previously  men¬ 
tioned,  controllability  depends  only  on  the  circuitry  driving  a  signal;  thus,  when  we 
consider  the  controllability  of  signal  Z,  we  need  only  consider  subcircuits  A  and  B, 
and  the  functionality  of  the  operation  directly  driving  signal  Z.  If  observability  were  a 
true  complement  to  controllability,  the  observability  of  a  signal  would  depend  only  on 
that  part  of  the  circuit  that  the  signal  is  driving,  i.e.,  on  that  part  of  the  circuit  between 
the  signal  and  the  primary  output.  In  Figure  4- 1,  if  we  were  considering  the  observ- 
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ability  of  signal  X,  we  would  have  to  consider  only  subcircuit  C  and  the  operation  that 
X  is  immediately  driving;  we  could  throw  subcircuits  A  and  B  away.  However,  this  can 
not  give  us  a  true  picture  of  observability.  In  actuality,  the  observability  of  signal  X 
depends  heavily  on  subcircuits  A  and  B.  For  example,  suppose  that  the  operation  that 
X  is  driving  is  a  multiplication.  The  controllability  of  the  other  input  to  the  multiplier, 
labelled  Y  on  Figure  4- 1,  is  a  crucial  part  of  the  observability  of  X.  If  signal  Y  always 
takes  value  zero,  the  output  of  the  multiplier  will  always  be  zero,  regardless  of  the 
value  of  signal  X\  in  this  case,  signal  X  is  effectively  cut  off  from  ever  reaching  the  pri¬ 
mary  output.  If,  however,  signal  Y  always  takes  value  one,  the  multiplier  will  transmit 
the  signal  X  unchanged  to  its  output,  making  it  easier  to  propagate  X  to  the  primary 
output.  Thus,  the  observability  of  signal  X  can  not  be  considered  without  first  consid¬ 
ering  the  controllability  of  signal  Y  Since  the  controllability  of  signal  Y  comes  from 
subcircuit  B,  we  see  that  we  can  not  remove  subcircuit  B  from  consideration. 

It  turns  out  that  the  observability  of  signal  X  depends  on  subcircuit  A  as  well.  The 
point  behind  observability  is  to  measure  how  easily  we  can  propagate  a  change  in  sig¬ 
nal  X  to  the  primary  output;  this  change  is  caused  by  some  fault  in  the  circuit  driving 
signal  X.  The  changes  that  can  occur  in  signal  X  depend  on  which  values  signal  X  can 
take  on  when  no  faults  are  present.  This  means  that  if  subcircuit  A  is  such  that  signal  X 
is  constrained  to  take  on  only  a  certain  value  during  fault-free  operation,  computing 
the  observability  of  signal  X  is  a  different  problem  than  if  signal  X  is  constrained  in 


some  other  way. 


As  a  result,  the  formula  for  the  transparency  of  signal  X  will  depend  not  only  on  the 
transparency  of  signal  Z,  represented  by  what  we  will  soon  define  as  the  state-based 
transparency  of  Z,  ^ ,  but  also  on  the  controllability  of  signals  X  and  Y,  represented  by 
their  state  probability  distributions  px  and  py.V^Q  now  describe  the  transparency 
computation  in  greater  detail. 

As  a  tool  for  evaluating  transparency,  we  define  a  state-based  transparency  vector  t^ 
for  each  signal  X,  where  each  element  of  the  vector  is  a  kind  of  conditional  transpar¬ 
ency  based  on  the  fault-free  state  of  the  signal.  We  have: 


1 

2 

Pr  {X  is  transparent  X  has  fault-free  state  0} 

Pr  {X  is  transparent  X  has  fault-free  state  1 } 

Pr  {X  is  transparent  X  has  fault-free  state  2} 

/x,  21-^1  -  1_ 

Pr  {X  is  transparent  |X  has  fault-free  state  21-^1  -  1 } 

In  this  definition,  signal  X  is  transparent  if  a  change  in  X  causes  a  change  in  the  pri¬ 
mary  output.  The  basic  idea  behind  tx/i  is  this:  because  of  the  presence  of  some  fault  in 
the  circuit,  the  value  of  signal  X  has  been  perturbed  from  its  fault-free  value  of  i  to 
some  different  value,  is  the  probability  that  the  state  error  can  be  propagated  from 
signal  X  to  the  primary  output.  Once  we  have  the  state-based  transparency  vector  t^ 
and  the  state  probability  distribution  vector  px,  we  can  write  the  transparency  of  sig¬ 


nal  X  as  a  dot  product: 
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The  state-based  transparency  vectors  are  computed  in  bottom-up  fashion,  starting  with 
the  primary  output,  and  moving  signal  by  signal  towards  the  primary  inputs.  The  pri¬ 
mary  output  is  an  observable  point  with  perfect  transparency,  and  therefore  has 
t  =  j  1  1  ...  ij  .  We  move  up  one  level  in  the  circuit  by  separating  the  transparency  of 
a  signal  X  into  two  components:  the  probability  of  propagating  a  state  error  through  a 
single  operation  to  the  next  signal  Z  in  the  behavior  (see  Hgure  4-1);  and  the  probabil¬ 
ity  of  propagating  the  state  error  from  that  next  signal  Z  to  the  primary  output.  Thus, 
we  will  write  the  state-based  transparency  of  X,  in  terms  of  the  state-based  trans¬ 
parency  of  Z,  .  We  have: 
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(EQ  4-7) 


Vy  =  0  A;=0  ; 

This  equation  shows  the  two  components  of  the  transparency  of  signal  X.  The  first  sum 
indicates  the  probability  of  propagating  the  state  error  through  the  single  operation  to 
signal  Z.  S®  is  a  matrix  indicating  the  sensitivity  of  the  operation  ®  to  changes  in  val- 
ues  in  X,  given  a  particular  fault-free  value  of  X  and  value  of  Y.  S  .  .  indicates  the 
probability  that  a  change  in  the  value  of  signal  X,  from  a  fault-free  value  of  i  to  some 
different  value  can  be  observed  at  the  output  of  the  operation  when  signal  Y  has 
value  j.  This  is  the  probability  that  i®j^  i'  ®  j .  Although  in  general  the  entire  sensi¬ 
tivity  matrix  must  be  generated,  for  many  practical  functions  it  is  not  necessary  to 
work  with  full  matrices  at  all.  For  example,  for  addition,  all  elements  of  the  sensitivity 
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matrix  are  equal  to  one,  and  the  entire  first  sum  of  Equation  4-7  collapses  simply  to  the 
value  one. 

The  second  sum  in  Equation  4-7,  based  on  the  transparency  of  Z,  indicates  the  proba- 
bihty  of  propagating  the  state  error  from  signal  Z  the  rest  of  the  way  to  the  primary 
output.  When  the  fault-free  value  of  signal  X  is  i  and  the  value  of  signal  Y  isj,  the 
fault-free  value  of  signal  Z  is  z  0  j ;  this  is  the  reason  that  .  depends  on  ,• 

We  have  not  yet  developed  a  method  for  transparency  computation  in  general  circuits, 
particularly  in  those  circuits  with  a  high  degree  of  reconvergent  fanout.  Fortunately, 
we  have  found  that  it  is  not  necessary  to  handle  such  circuits.  The  reason  for  this  is  that 
our  test  point  insertion  methodologies  always  enhance  controllability  first,  before  con¬ 
sidering  observability.  Controllability  is  considered  first  because  enhancing  controlla¬ 
bility  can  have  an  effect  on  observability.  This  was  the  case  for  our  earlier  example  of 
a  multiplier  that  had  one  input  tied  to  zero;  recall  that  this  circuit  has  an  observability 
problem  because  the  zero  makes  it  impossible  to  propagate  the  signal  on  the  other 
input  of  the  multiplier  through  the  multiplier.  This  circuit  also  has  a  controllability 
problem,  since  the  constant  zero  is  not  a  high  quality  test  pattern  to  test  the  multiplier. 
If  the  controllability  problem  is  fixed  by  modifying  the  circuit  so  that  the  input  takes 
on  a  wider  range  of  values,  the  observability  problem  will  be  fixed,  too.  In  contrast, 
enhancing  observability  has  no  effect  on  controllability;  regardless  of  how  extra 
observable  points  are  added  to  a  circuit,  the  flow  of  signals  through  the  operations  of 


the  circuit  remains  the  same. 
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Because  controllability  enhancement  is  done  before  observability  enhancement,  the 
circuits  for  which  transparency  must  be  computed  are  not  duly  general.  Controllability 
enhancement  frequently  involves  the  removal  of  reconvergent  fanout,  for  example,  so 
in  practice  we  do  not  need  to  be  able  to  compute  transparency  for  circuits  with  com¬ 
plex  patterns  of  reconvergence.  Simpler  patterns  of  reconvergence  are  handled  with  a 
preprocessing  transformation  analogous  to  that  used  for  controllability  analysis  and 
introduced  in  Chapter  Five. 

One  circuit  structure  that  we  do  need  to  handle  is  that  of  fanout  to  more  than  one  pri¬ 
mary  output.  Consider  the  structure  of  Figure  4-2,  where  subcircuit  A  and  subcircuit  B 
do  not  overlap,  i.e.,  where  the  two  paths  to  the  output  are  independent.  We  first  com¬ 
pute  the  transparencies  of  the  two  signals  on  the  fanout  branches,  and  Xb,  by  the 
method  already  described.  Clearly,  the  transparency  of  the  source  signal  X  is  higher 
than  the  transparency  of  either  of  the  branches,  since  the  only  requirement  for  X  to  be 
transparent  is  that  we  be  able  to  propagate  an  error  on  signal  X  to  at  least  one  of  the 
primary  outputs.  Therefore,  there  is  a  choice  of  path:  through  subcircuit  A,  through 
subcircuit  B,  or  both.  We  define  the  transparency  of  the  source  as: 


MT(X)  =  MT(X^)  +MTiXg)  -  MT  (X^)  MT  (X^)  . 


(EQ  4-8) 
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4.3  Summary 

This  chapter  has  provided  mathematical  definitions  for  the  testability  metrics  used 
throughout  our  work.  All  three  of  the  testability  metrics,  randomness,  expected  state 
coverage,  and  transparency,  are  based  on  the  state  probability  distributions  of  the  sig¬ 
nals  in  the  circuit.  We  will  present  a  Markov  model  for  computing  the  state  probability 
distributions  in  Chapter  Five. 

There  are  alternatives  to  using  the  Markov  model  to  compute  the  state  probability  dis¬ 
tributions.  The  probability  distributions  can  be  determined  empirically  with  the  use  of 
Monte  Carlo  simulation  of  the  circuit.  It  is  also  possible  to  develop  heuristics  to  calcu¬ 
late  the  testability  metrics  directly,  without  relying  on  the  underlying  state  probability 
distributions.  Heuristics  have  the  potential  to  be  much  faster  to  compute,  since  they 
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can  work  with  scalar  values  rather  than  the  vectors  and  matrices  required  for  a  Markov 
model  approach,  or  the  vectors  and  long  simulation  times  required  for  Monte  Carlo 
simulation.  Of  course,  heuristics  are  inherently  less  accurate  than  analytical  methods. 
The  development  of  heuristics  for  computation  of  the  testability  metrics  is  beyond  the 
scope  of  this  dissertation. 


Chapter  5 


A  Markov  Model  for  BIST 
Analysis 


This  chapter  develops  a  Markov  model  to  evaluate  the  test  effectiveness  of  a  circuit 
with  built-in  self-test  (BIST)  features.This  model  can  be  applied  at  various  levels  of 
design  abstraction;  here,  we  will  see  its  use  at  the  gate  level  and  register  transfer  level 
in  the  structural  domain,  and  at  the  algorithmic  level  in  the  behavioral  domain. 

The  purpose  of  the  Markov  model  is  to  provide  analytical  values  for  the  probability 
distribution  of  the  state  of  each  signal  in  the  circuit;  testability  metrics  such  as  entropy- 
based  randomness,  expected  state  coverage,  and  transparency  can  be  computed  from 
the  probability  distributions  using  the  formulas  given  in  Chapter  Four.  By  providing  a 
means  for  BIST  analysis  and  evaluation,  the  testability  metrics  allow  us  to  trade  area 
and  performance  against  test  effectiveness  when  doing  BIST  insertion. 

The  Markov  model  described  in  this  chapter  is  based  on  previous  approaches  for  BIST 
analysis  at  the  register  transfer  level.  The  Markov  model  used  here  provides  the  fol¬ 
lowing  advances  over  previous  models  by  Chuang  and  Gupta  [ChGu89]  and  Kim,  Ha, 
and  Tront  [KiHT88]: 
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•  a  preprocessing  transformation  is  used  to  remove  reconvergent  fanout  from 
the  circuits,  allowing  the  effects  of  word-level  correlation  on  test  quality  to 
be  accurately  modeled. 

•  an  iterative  technique  is  developed  so  that  the  model  can  handle  circuits 
with  indirect  feedback,  i.e.,  circuits  in  which  a  register  or  flip-flop  feeds 
back  into  itself  via  one  or  more  intermediate  registers  or  flip-flops. 

•  the  model  is  extended  to  include  the  newer  circular  BIST  methodology  as 
well  as  conventional  BIST. 

This  work  also  uses  the  model  in  a  new  way,  to  evaluate  testability  metrics  as  an  aid  to 
BIST  insertion. 

The  chapter  is  organized  as  follows.  Section  5.1  presents  the  mathematical  details  of 
the  Markov  model  for  register  transfer  level  (RTL)  circuits.  Section  5.2  describes  a 
transformation  technique  for  removing  reconvergent  fanout  from  an  RTL  circuit,  pro¬ 
vides  an  example  transformation,  and  explains  how  the  preprocessing  transformation 
is  used  to  make  the  Markov  model  more  powerful.  Section  5.3  discusses  the  computa¬ 
tional  complexity  of  the  overall  analysis  for  RTL  circuits.  Section  5.4  shows  the  same 
Markov  model  as  applied  to  gate  level  circuits;  derivations  are  given  to  show  that  the 
matrix  computations  necessary  at  the  register  transfer  level  are  reduced  to  scalar  equa¬ 
tions  at  the  gate  level.  Section  5.5  describes  how  the  Markov  model  can  be  applied  at 
the  algorithmic  level,  in  the  behavioral  domain.  Finally,  Section  5.6  presents  a  sum¬ 


mary. 
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5. 1  The  Markov  Model  at  the  RTL 

This  section  describes  a  Markov  model  for  the  analysis  of  register  transfer  level  (RTL) 
circuits  with  BIST  features.  For  the  purposes  of  this  section,  RTL  circuits  consist  of 
registers,  arithmetic  logic  units  (ALUs)  and  buses.  At  this  point,  we  consider  only 
datapaths,  and  not  control  logic.  First,  we  describe  how  the  datapath  is  partitioned 
before  analysis  in  order  to  make  the  problem  more  computationally  tractable.  Next,  we 
describe  the  steps  in  analyzing  a  single  register  of  the  datapath.  Then,  we  outline  the 
iterative  procedure  that  is  used  to  resolve  the  analysis  when  a  datapath  contains  indi¬ 
rect  feedback. 

5.1.1  Partitioning  a  RTL  datapath  for  analysis 

It  is  impossible  to  analyze  most  RTL  datapaths  as  a  whole,  since  a  datapath  with  even 
a  moderate  number  of  registers  has  a  very  large,  unmanageable  number  of  system 
states.  As  a  result,  we  must  partition  the  datapath  into  smaller  pieces  for  analysis.  Note 
that  if  we  were  able  to  analyze  the  datapath  in  one  big  piece,  the  analysis  would  be 
deterministic;  given  a  current  system  state  and  a  set  of  primary  input  values,  there  is 
only  one  possible  next  system  state.  The  price  we  pay  for  partitioning  is  that  any  anal¬ 
ysis  must  now  be  probabilistic,  rather  than  deterministic;  since  we  are  looking  at  only 
a  small  section  of  the  datapath  at  a  time,  we  can  not  speak  with  certainty  about  what  is 
happening  in  the  rest  of  the  datapath. 
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(a)  a  simple  RTL  datapath.  (b)  the  datapath  partitioned. 

Figure  5-1.  An  example  of  partitioning  for  analysis. 


For  this  application,  we’d  like  to  partition  the  datapath  such  that  each  partition  con¬ 
tains  the  information  necessary  to  analyze  a  single  register.  The  rules  for  partitioning 
are  simple;  create  one  partition  for  each  register  of  the  datapath,  where  the  register 
serves  as  the  output  register  of  the  partition.  The  partition  should  consist  of  the  regis¬ 
ter,  any  combinational  logic  that  drives  the  register,  plus  the  registers  that  serve  as 
inputs  to  that  combinational  logic.  The  partition  for  a  register  is  unique.  The  partitions 
represent  the  transfer  of  data  from  register  to  register.  Figure  5-1  shows  a  simple  RTL 
datapath  and  its  partitions.  Note  that  partitioning  does  not  remove  feedback  loops  from 
the  datapath.  Direct  feedback  manifests  itself  as  feedback  within  a  partition,  as  is  the 
case  with  register  3  of  this  example,  and  is  resolved  by  the  Markov  model.  Larger 
feedback  loops  manifest  themselves  as  interdependency  among  the  partitions;  for  this 
example,  we  see  that  the  partitions  for  registers  4  and  5  are  interdependent,  since  regis- 
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ter  5  is  an  input  to  the  partition  for  register  4,  and  vice  versa.  Subsection  5.1.3  explains 
how  this  type  of  feedback  is  resolved.  In  Section  5.2,  we  will  see  that  this  simple  parti¬ 
tioning  scheme  is  not  directly  applicable  to  all  datapaths;  there,  a  preprocessing  trans¬ 
formation  will  be  presented  for  use  on  some  datapaths  before  partitioning.  This  will 
enable  the  correct  modeling  of  a  more  general  class  of  circuits. 

5.1.2  Analyzing  a  single  register 

The  objective  of  analyzing  a  single  register  is  to  find  the  register’s  state  probability 
distribution  p  as  defined  in  Equation  4-2  on  page  47.  The  steps  for  finding  p  are 
described  in  the  procedure  of  Figure  5-2.  The  remainder  of  this  subsection  describes 
the  steps  of  the  procedure  in  greater  detail. 

Procedure  AnalyzeRegister 

Step  I.  Find  Q,  a  matrix  that  describes  the  register’s  state  transitions 
when  BIST  is  disabled. 

Step  2.  Modify  Q  to  compute  C,  a  matrix  that  describes  the  register’s 
state  transitions  when  BIST  is  enabled. 

Step  3.  Use  C  to  compute  p ,  a  row  vector  describing  the  steady-state 

probability  distribution  over  the  register’s  state  space. 

Figure  5-2.  Procedure  for  analyzing  a  single  register. 
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Step  1:  Computing  the  BIST-disabled  state  transition  matrix  Q 

The  first  step  in  analyzing  a  register  X  is  to  compute  a  matrix  describing  the  register’s 

transitions  from  state  to  state  during  normal  (BIST-disabled)  operation.  Let  Q  denote 

'Xl  jxj 

this  state  transition  matrix.  The  matrix  ^  is  2'  by  2  ,  where  |X|  is  the  bit  width  of 

the  register.  Q  has  elements 

q.j  =  Pr  { next  register  state  is  j  I  current  register  state  is  /}  (eq  s-i) 

for  i,j  =  0,  1, ...,  2l^l  -  1 .  The  register’s  next  state  is  written  in  terms  of  the  local  struc- 
mre  of  the  datapath,  more  specifically  in  terms  of  that  part  of  the  datapath  contained 
within  the  register’s  partition.  The  state  transition  matrix  Q  depends  on  the  functional¬ 
ity  of  the  combinational  logic  within  the  partition,  and  the  probability  distributions  of 
any  registers  that  serve  as  input  registers  to  the  partition.  The  matrix  Q  also  depends  on 
whether  there  is  direct  feedback  within  the  partition,  i.e.,  on  whether  the  output  regis¬ 
ter  feeds  back  into  itself  through  the  combinational  logic  of  the  partition.  A  register 
that  feeds  back  directly  into  itself  is  said  to  be  self-adjacent  [HuPe87].  Note  that  if  a 
register  is  not  self-adjacent,  the  next  state  of  the  register  does  not  depend  on  the  cur¬ 
rent  state,  and  the  equation  for  qij  is  reduced  to 

q.j  =  Pr  {next  register  state  is  J}  (eq  5-2) 

for  i,j  =  0,  1, ...,  2l^l  -  1 .  In  this  special  case,  all  rows  of  the  state  transition  matrix  Q 


are  identical. 
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Step  2:  Modifying  Q  to  compute  the  BIST  transition  matrix 

Step  2  of  the  procedure  AmlyzeRegister  modifies  Q,  the  register’s  state  transition 
matrix  for  the  BIST-disabled  case,  to  compute  C,  the  state  transition  matrix  for  the 
register  when  BIST  is  enabled.  The  elements  of  C  are  defined  by: 

c^j  =  Pr  {  next  register  state  is  j  I  current  register  state  is  /}  (eq  5-3) 

keeping  in  mind  that  BIST  is  enabled.  How  C  is  computed  depends  on  the  BIST  meth¬ 
odology  used.  For  example,  suppose  that  the  register  is  a  multiple  input  signature  reg¬ 
ister  (MISR).  From  the  way  that  MISRs  work  (see  Hgure  2-4  on  page  18  for  a  gate 
level  view),  we  know  that  the  next  state  of  the  register  is  the  bitwise  XOR  of  what  the 
next  state  would  be  if  BIST  were  disabled  and  a  shifted  version  of  the  register’s  cur¬ 
rent  state,  with  a  modulo-2  sum  of  the  values  at  the  feedback  taps  shifted  into  the 
newly  vacated  bit.  Suppose  that  the  register  moves  from  state  i  to  state  k  when  BIST  is 
disabled;  when  BIST  is  enabled  under  the  same  conditions,  the  register  will  move 
from  state  i  to  state  k  ©  SHL(i,  where/(0  is  the  value  of  the  feedback  bit. 

Information  about  what  the  next  state  would  be  if  BIST  were  disabled  comes  from  the 
matrix  Q,  computed  in  Step  1.  For  MISRs  and  TPGRs,  each  element  of  C  can  be 
expressed  in  terms  of  Q  by; 


(EQ  5-4) 


1.  ©  denotes  a  bitwise  XOR. 

2.  SHL(i,  b)  denotes  a  left  bit  shift  of  the  binary  representation  for  i,  with  the  bit  b  shifted  into  the  newly 
vacated  least  significant  bit. 
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The  circular  BIST  and  circular  self-test  path  cases  are  similar,  except  that  the  next  state 
of  a  register  depends  also  on  the  bit  value  being  shifted  into  the  register  from  the  circu¬ 
lar  BIST  chain.  From  the  way  that  a  circular  BIST  test  register  works  (see  Figure  2-7 
on  page  21  for  a  gate  level  view),  we  know  that  the  next  state  of  a  register  is  the  bit¬ 
wise  XOR  of  what  the  next  state  would  be  if  BIST  were  disabled  and  a  shifted  version 
of  the  register’s  current  state,  where  a  bit  from  the  preceding  register  on  the  circular 
BIST  chain  is  shifted  into  the  vacated  bit  position.  Consider  register  7  of  the  datapath 
of  Figure  5-3  as  an  example.  If  BIST  were  disabled,  register  7’s  next  state  would  be 
the  value  coming  into  register  7  from  the  data  path  (i.e.,  from  register  6).  Since  BIST  is 
enabled,  this  value  is  bitwise  XORed  with  a  shifted  version  of  the  current  state;  this 
shifted  version  is  a  right  shift  of  the  current  state,  where  the  newly  vacated  most  signif¬ 
icant  bit  is  filled  with  a  value  coming  from  the  circular  self-test  path,  namely,  the  most 
significant  bit  of  register  1. 


Figure  5-3.  An  example  RTL  datapath  using  the  circular  self-test  path  technique. 
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For  the  general  case,  let  p  denote  the  probability  that  the  bit  shifted  into  the  register 
from  the  circular  BIST  chain  is  a  1.  Since  the  bit  comes  from  either  the  most  or  least 
significant  position  of  the  preceding  register  T  (depending  on  whether  that  register  is 
shifted  to  the  right  or  to  the  left),  p  can  be  written  in  terms  of  Fs  state  probability  dis¬ 
tribution,  where  pjj  denotes  the  probability  that  register  Tis  in  state  i  (see  Equation 
4-2  on  page  47): 


r 


- 1 


p  =  Pr  {  shifted- in  bit  is  a  1 } 


1  =  2 


I  fzi 

171-1 


if  register  T  is  left-shifted 

(EQ  5-5) 


^  Pt  2i  +  1  register  T  is  right-shifted. 
V  1  =  0 

Thus,  for  a  circular  BIST  test  register,  each  element  of  C  can  be  expressed  in  terms  of 
Q  and  p  by  the  following:^ 


^ij 


©  SHLdM)  +  ( 1  -  P)  ^/,;  ©  SHL(|,0)  ^  left-shifted 

(EQ  5-6) 

^  P«i.y®SHR(,-,l)-^(‘-P>«(,y®SHR(W)  ifregisterR  is  right-shifted. 


Equation  5-6  shows  that  each  element  of  C  is  written  in  terms  of  two  different  ele¬ 
ments  of  Q.  In  fact,  each  row  of  C  is  a  linear  combination  of  two  different  permuta¬ 
tions  of  the  same  row  of  Q. 


3.  SHR(i,  b)  denotes  a  right  bit  shift  of  the  binary  representation  for  /,  with  the  bit  b  shifted  into  the 
newly  vacated  most  significant  bit. 
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Step  3:  Computing  the  register  state’s  probability  distribution 

Step  3  of  the  procedure  AnalyzeRegister  uses  C,  the  register’s  state  transition  matrix 
when  BIST  is  enabled,  computed  in  Step  2,  to  compute  the  register’s  state  probability 
distribution  in  the  steady  state.  Suppose  that  p  (t)  denotes  the  probability  distribution 
at  time  t.  The  register  state  transition  matrix  C  relates  p{t+  1)  to  p(t)  : 

p(t+  1)  =  p(t)C  . 

It’s  easy  to  see  that  C  is  a  Markov  matrix  [StraSS];  all  of  its  elements  are  nonnegative, 
and  each  row  adds  to  1.  Like  all  Markov  matrices,  C  has  an  eigenvalue  equal  to  1.  This 
means  that  the  probability  distribution  will  settle  down  to  a  steady-state;  it  is  this 
steady-state  probability  distribution  p  that  we  need.  We  have 

P  =  pC  .  (EQ  5-7) 

From  this  equation,  p  is  the  left  eigenvector  of  C  corresponding  to  eigenvalue  1.  We 
can  solve  the  equation  for  p  in  the  following  way  [Stra88]: 

T 

•  5rep A.  Compute  C  -I. 

•  Step  B.  Do  a  QR  decomposition  of  -I,  -1)  =  M. 

-T 

•  Step  C.  Solve  the  upper  triangular  system  Mp  =  0  for  p . 

It  is  always  the  case  that  the  matrix  M  is  rank  deficient,  so  that  p  has  a  non-trivial 


(non-zero)  solution. 
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5.1.3  Iterating  to  analyze  the  entire  datapath 

Two  steps  (Steps  1  and  2)  of  the  procedure  AnalyzeRegister,  given  in  Figure  5-2,  use 
information  about  other  registers  in  the  datapath,  and  assume  that  these  other  registers 
have  already  been  analyzed.  For  some  datapaths,  it  is  possible  to  analyze  the  registers 
in  such  an  order  that  the  information  needed  has  already  been  computed;  for  these 
datapaths,  each  register  need  be  analyzed  only  one  time.  However,  for  datapaths  con¬ 
taining  indirect  feedback,  it  is  impossible  to  choose  such  an  order.  For  example,  con¬ 
sider  the  datapath  fragment  of  Figure  5-4,  which  has  indirect  feedback.  Analysis  of 
register  REG5  requires  knowledge  about  register  REG4’s  probability  distribution,  and 
vice  versa.  The  situation  is  similar  for  any  datapath  using  the  circular  self-test  path 
technique,  since  each  register  in  a  circular  self-test  path  feeds  back  into  itself  indirectly 
through  the  path.  In  order  to  handle  these  cases,  all  registers  are  assigned  arbitrary  ini¬ 
tial  probability  distributions,  and  an  iterative  process,  shown  in  Rgure  5-5,  is  used  to 
repeatedly  analyze  registers  until  the  probability  distributions  have  reached  steady- 
state  values.  In  general,  there  is  no  guarantee  that  the  iterative  process  will  terminate; 
however,  in  our  experience,  the  process  does  terminate  for  practical  cases,  and  most 
datapaths  require  only  a  few  iterations. 

We  now  turn  to  a  transformation  that  can  be  used  to  process  the  datapaths  before 
applying  the  Markov  model,  making  the  model  applicable  to  a  wider  variety  of  datap¬ 
aths  by  enabling  it  to  accurately  handle  word-level  correlation.  We  describe  the  trans- 
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Figure  5-4.  An  RTL  datapath  fragment  with  indirect  feedback. 

formation  on  datapaths  at  the  register  transfer  level,  although  it  is  applicable  to  circuits 
at  the  gate  and  algorithmic  levels  as  well. 


5.2  A  Circuit  Transformation  Technique 

This  section  describes  a  technique  for  transforming  a  general  RTL  circuit  into  a  func¬ 
tionally  equivalent  RTL  circuit  with  no  reconvergent  fanout.  Probabilistic  analysis  of 
any  kind  can  be  greatly  simplified  by  assuming  that  there  is  no  correlation  among  the 
input  registers  of  a  partition,  i.e.,  that  one  input  register’s  state  is  independent  of  every 
other  input  register’s  state.  In  practice,  word-level  correlation  occurs  frequently,  and  is 
a  direct  result  of  fanout  in  the  circuit  structure.  For  example,  consider  the  circuit  of 
Figure  5-6(a).  Figure  5-6(b)  of  the  figure  shows  a  direct  partitioning  of  the  circuit. 
When  register  6  is  analyzed,  information  about  registers  4  and  5  is  used.  The  states  of 
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Figure  5-5.  Iterative  procedure  for  analyzing  an  RTL  datapath. 
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(a)  An  RTL  circuit  with 
reconvergent  fanout. 


(b)  A  direct  partitioning  of  the  circuit. 


(c)  The  circuit  transformed. 


(d)  A  partitioning  of  the  transformed  circuit. 
Figure  5-6.  A  simple  transformation  example. 
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these  two  registers  are  heavily  correlated;  both  are  functions  of  the  state  of  register  2  at 
the  previous  clock.  Any  probabilistic  analysis  operating  directly  on  the  original  circuit 
should  take  this  correlation  into  account. 

Figure  5-6(c)  and  Figure  5-6(d),  which  show  the  transformed  circuit  and  the  corre¬ 
sponding  partitions,  respectively,  demonstrate  how  the  transformation  technique  pre¬ 
sented  here  simplifies  the  probabilistic  analysis  of  a  circuit  by  enabling  us  to  express  a 
register’s  state  in  terms  of  a  set  of  uncorrelated,  statistically-independent  source  regis¬ 
ters.  The  reconvergent  fanout  is  subsumed  by  the  combinational  logic,  where  it  can  be 
taken  into  account  easily;  the  penalty  is  that  we  must  now  work  with  more  complex 
combinational  logic  blocks.  If  desired,  a  dummy  register  may  be  added  between  CLB 
C’  and  register  6  to  preserve  the  timing  of  the  circuit  by  providing  an  additional  clock 
cycle  of  delay  ;  for  many  kinds  of  probabilistic  analysis,  including  this  one,  such  tim¬ 
ing  details  are  not  important,  and  the  register  can  be  omitted.  For  the  transformed  cir¬ 
cuit,  the  inputs  to  each  partition  are  independent,  and  subsequent  probabilistic  analysis 
may  ignore  correlation. 

Note  that  the  transformation  described  in  this  section  does  not  remove  feedback  loops 
from  the  circuit.  Direct  feedback  loops  are  taken  into  account  by  the  Markov  model,  as 
previously  described  in  Subsection  5. 1.2,  and  indirect  feedback  loops  are  resolved 
through  the  iterative  process  previously  described  in  Subsection  5.1.3. 
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5.2.1  Preliminary  deflnitions 

Correlation  between  registers  is  determined  by  the  sources  of  the  registers,  and  there¬ 
fore  information  about  the  flow  of  data  through  the  circuit  is  needed.  We  say  that  reg¬ 
ister  r  is  a  parent  of  register  R  if  there  is  a  path  from  TioR  that  passes  through  only 
combinational  logic.'^  The  set  of  registers  that  are  parents  to  register  R  is  denoted 
“PiR).  Next,  we  say  that  register  7  is  a  grandparent  of  register  R  if  there  is  a  register  X 
such  that  r  is  a  parent  of  X  and  X  is  a  parent  of  R.  In  the  circuit  of  Figure  5-6(a),  regis¬ 
ter  6  has  two  parents  (registers  4  and  5)  and  three  grandparents  (registers  1,  2,  and  3). 
Since  we  must  trace  the  sources  driving  R  from  an  arbitrary  distance  up  the  circuit,  we 
now  generalize  the  concept  of  parents  in  the  following  way:  a  register  7  is  a  distance-i 
source  of  register  R  if  there  is  a  path  from  register  7  to  register  R  that  passes  through 
exactly  i  intermediate  registers.  Thus,  parents  are  distance-0  sources,  grandparents  are 
distance- 1  sources,  and  so  on.  Note  that  if  a  circuit  contains  feedback,  it  is  possible  for 
a  register  /?  to  be  a  distance-/  source  of  itself.  The  set  of  distance-/  sources  of  R  is 
denoted/^, (7?). 

Two  registers  R1  and  R2  are  said  to  have  a  common  source  if  there  is  a  register  7  and  a 
distance  /  such  that  register  7  is  a  distance-/  source  of  both  R1  and  R2.  If  two  registers 
have  a  common  source,  their  states  are  correlated,  that  is,  the  probability  that  register 
/?7  is  in  a  given  state  /  is  not  statistically  independent  of  the  probability  that  register  R2 


4.  This  includes  the  trivial  case  of  a  direct  register-to -register  connection,  for  which  the  path  from  T  to 
R  has  no  intervening  logic. 
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is  in  a  given  state  j.  Again,  the  purpose  of  the  transformation  is  to  ensure  that  when  we 
partition  a  circuit,  each  register’s  state  is  expressed  in  terms  of  a  set  of  uncorrelated 
source  registers. 

5.2.2  Finding  the  sources  of  a  register 

The  sources  of  a  given  register  R  can  be  found  using  the  simple  recursion 

(R)  =  ParentsOfRegisterSet  ^  (R) ) , 

that  is,  the  distance-/  sources  of  R  are  the  parents  of  the  distance-(/- 1)  sources  of  R.  If 
the  circuit  contains  no  feedback,  register  R  has  a  finite  number  of  generations  of 
sources;  in  tracing  up  the  circuit,  the  primary  input  registers  are  eventually  reached. 
However,  if  R  or  one  of  its  sources  is  in  a  feedback  loop,  R  has  an  infinite  number  of 
generations  of  sources.  Figure  5-7  formalizes  an  algorithm  for  finding  the  sources  of  a 
register. 


5.2.3  Transforming  and  partitioning  the  circuit 

An  algorithm  for  transforming  and  partitioning  the  circuit  is  shown  in  Figure  5-8. 
Given  a  register  R,  it  returns  information  about  R's  partition  in  the  transformed  version 
of  the  circuit.  The  main  point  here  is  that  we  would  like  to  keep  the  partitions  as  local 
as  possible.  At  the  same  time,  we  may  not  be  able  to  choose  partitions  in  the  straight¬ 
forward  way  described  in  Subsection  5.1.1,  since  we  would  like  to  be  sure  that  the 
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I  Algorithm  ParentsOfRegisterSet 

\  Input:  An  RTL  circuit  and  a  set  of  designated  registers  S. 
I  Output:  The  set  of  parents  of  S,  P(S). 

i 

1 

SetP(S)  =  0. 

for  y  =  1  to  ISI;  /*  iterate  through  registers  in  S*/ 

Let  R  be  the  register  in  the  set  S. 

Set  P(S)  =  P(S)  kjP{R)  . 

end; 


Algorithm  SourcesOfRegister 

Input:  An  RTL  circuit  with  a  designated  register  T. 

Output:  The  number  of  generations  gen  of  sources  of  register  T, 
and  the  sets  ^{(T)  for  0  <  i  <  gen . 

if  (P{T)  =  0)do; 

/*  Register  Thas  no  parents,  and  therefore  no  sources.  */ 

Set  gen  to  0. 
return; 
end; 

Set/^o(T)  =  Pin 
Initialize  gen  to  1. 
while  (true)  do; 

/*  Trace  back  another  level  in  the  circuit.  */ 
if  {ParentsOfRegisterSet  _  i  (T) )  =0)  then  do; 

return;  /*  We  have  traced  back  to  the  primary  inputs.  */ 
end; 
else  do; 

Set  ( T)  =  ParentsOfRegisterSet  {^ge„  _  i  ( 7^  )  • 
increment  gen-, 

if  (>^gen  -  1  ( 7)  =  ( 71  for  somey,  0<j<gen-l  )  do; 

/*  We  have  traced  all  the  way  around  a  feedback  loop.  */ 
return; 
end; 
end; 
end; 

Figure  5-7.  Algorithm  to  find  the  sources  of  a  register. 
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input  registers  to  the  partition  have  statistically-independent  states.  The  algorithm 
starts  with  a  straightforward,  small  partition,  and  enlarges  it  one  step  at  a  time  as  nec¬ 
essary  until  all  correlation  among  the  input  registers  of  the  partition  is  eliminated. 

As  the  algorithm  proceeds,  it  must  keep  track  of  the  expression  for  the  state  of  the  out¬ 
put  register  in  terms  of  the  states  of  the  input  registers  to  the  partition.  This  expression 
is  most  easily  stored  as  a  rooted  binary  tree  with  the  internal  nodes  of  the  tree  repre¬ 
senting  the  operations  performed  by  the  combinational  logic  blocks,  and  the  leaves  of 
the  tree  representing  the  registers  serving  as  inputs  to  those  operations.  Such  a  tree  is 
commonly  called  an  expression  tree.  When  the  expression  tree  is  traversed  in  depth- 
first  order,  the  expression  is  created  in  infix  notation.  Let  ^denote  the  set  of  registers 
serving  as  leaves  of  the  tree;  these  registers  are  the  input  registers  to  the  partition. 

As  an  example  of  the  transformation  algorithm  and  the  use  of  the  expression  tree,  con¬ 
sider  the  circuit  of  Figure  5-6(a).  We  would  like  to  find  the  proper  partition  for  register 
6.  We  start  by  expressing  the  state  of  register  6  as  a  function  of  the  states  of  register  6’s 
parents,  as  shown  in  Rgure  5- 10(a).  At  this  point,  the  set  of  input  registers  to  our  par¬ 
tition  is  {R4,  R5}.  We  now  ask  whether  we  have  written  the  state  of  register  6  in 
terms  of  a  set  of  statistically-independent  registers;  the  answer  is  no,  since  registers  4 
and  5  have  a  source  in  common  (register  2  is  a  distance-0  source  of  both  register  4  and 
register  5).  As  a  result,  we  expand  our  expression  tree  around  registers  4  and  5  using 
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!  Algorithm  TransformAndPartition 
Input:  An  RTL  circuit  with  a  designated  register  R. 

Output:  Information  about  register  /?’s  partition  in  a  transformed  version  of  the 
circuit,  specifically: 

I  the  set  of  input  registers  for  the  partition, 

i  S-  an  expression  tree  for  the  state  of  R  written  as  a  function  of  the 

1  states  of  the  registers  in 

j 

I  if  (P(/?)  =  0)  then  do; 

/*  In  the  special  case  that  R  has  no  parents,  return  a  trivial  partition.  */ 

Set^  =  0  and 5=  NIL. 
return; 
end; 

I  /*  Let  designate  the  input  registers  to  the  partition;  start  by  initializing 
I  the  expression  tree  for  R’s  state  so  that  it  expresses  R's  state  as  a  function 
of  the  states  of  R’s  parents.  */ 

Set  ^  to  a  single  leaf  node,  register  R. 

S  =  ExpandExpressionTree  (P{R)).  I*  See  Figure  5-9  and  Figure  5-10.  */ 
^=PiR). 

while  (true)  do; 

if  (two  or  more  registers  in  ^  have  a  common  source)  then  do; 

Let  r  denote  the  common  source,  and  let  i  denote  the  distance. 

Let  e  ^  be  the  set  of  registers  for  which  register  7  is  a 
distance-/  source. 

/*  Update  the  partition  input  set  by  replacing  the  correlated  registers 
with  their  parents.  */ 

u  ParentsOfRegisterSet  ( ^^^^)  • 

/*  Update  the  expression  for  R's  state  to  be  in  terms  of  the  new 
partition  input  set.  */ 

S  =  ExpandExpressionTree  i^corr)-  See  Figure  5-9  and  Figure  5-10.  */ 
end; 

/*  R  is  written  in  terms  of  a  set  of  uncorrelated  source  registers.  */ 
else  return; 
end; 


Figure  5-8.  Algorithm  to  transform  and  partition  a  circuit. 
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Algorithm  ExpandExpressionTree 

Input:  An  expression  tree  S  and  a  set  of  registers  S. 

Output:  A  modified  expression  tree  (expanded  around  the  registers  in  S). 

for  each  leaf  register  L  in  the  tree  do; 
if  (L  e  S)  then  do; 

/*  Expand  the  tree  around  this  leaf.  */ 

Replace  this  leaf  of  the  tree  with  a  subtree  expressing  the  state  of  L 
in  terms  of  the  states  of  the  registers  in  P(L). 
end; 
end; 

Figure  5-9.  Algorithm  to  expand  an  expression  tree. 


(a)  The  initial  expression  ...  .  a  a  a  c 

^  (b)  expansion  around  registers  4  and  5. 

Figure  5-10.  Expansion  of  an  expression  for  register  6. 


the  ExpandExpressionTree  algorithm  of  Figure  5-9;  these  expansions  are  shown  in  part 
(b)  of  Figure  5- 10.  Our  new  input  register  set  is  [Rl,  R2,  R5}.  At  this  point,  we  are 
done,  since  there  is  no  correlation  among  the  registers  of 

5.2.4  A  more  complex  transformation  example 

This  subsection  provides  a  detailed  example  of  the  transformation  and  partitioning 
algorithm  by  fmding  the  proper  partition  for  register  1  of  the  circuit  of  Figure  5-11. 
The  circuit  is  shown  completely  transformed  and  partitioned  in  Rgure  5-12.  When 
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Figure  5-11.  An  example  RTL  circuit. 


finding  the  partition  for  register  1,  the  algorithm  begins  by  expressing  the  state  of  reg¬ 
ister  1  as  a  function  of  the  states  of  its  distance-0  sources,  registers  6  and  8;  this  initial 
partition  is  shown  in  Figure  5-  13(a).  It  then  determines  that  registers  6  and  8  are  corre¬ 
lated  sources,  since  they  have  a  distance-3  source  in  common;  register  1  drives  both 
register  6  and  register  8  via  paths  (REGl  REG8  — >  REGl  REG3  REG6) 
and  (REGl  ->  REG3  REG6  REGl  REG8),  respectively.  As  a  result,  the 
expression  for  the  state  of  register  1  is  expanded  around  registers  6  and  8,  to  get  the 
partition  shown  in  Figure  5-  13(b).  Again,  the  algorithm  looks  for  correlation  among 
the  input  registers  of  the  partition,  and  discovers  that  registers  8  and  2  are  correlated; 
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register  2  is  a  distance-0  source  of  both  registers.  As  a  result,  the  expression  is 
expanded  around  registers  8  and  2,  giving  the  partition  of  Rgure  5-  13(c).  After  deter¬ 
mining  that  no  two  input  registers  to  this  partition  are  correlated,  the  algorithm  returns 
this  partition  as  the  final  partition. 


(a)  (b) 


(c) 

Figure  5- 13.  The  steps  in  finding  a  partition  for  register  REGl  of  the  circuit  of 

Figure  5-11. 
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Note  that  the  input  registers  to  the  partition  are  sources  of  varying  distance  from  regis¬ 
ter  1.  Register  3  is  a  distance-/  source  of  register  1,  while  register  12  is  a  distance-2 
source.  Thus,  the  partition  expresses  the  state  of  register  1  at  time  r  as  a  function  of  the 
states  of  register  3  at  time  t-2  and  register  12  at  time  r-3.  This  does  not  present  a  prob¬ 
lem  for  the  analysis,  since  we  deal  with  steady-state  probability  distributions,  which 
do  not  change  from  time  step  to  time  step.  Dummy  registers  can  be  inserted  for  other 
applications  for  which  the  timing  must  be  preserved. 


5.3  Computational  Complexity  of  the  Analysis 

The  overall  execution  time  for  analyzing  a  circuit  is  dominated  by  the  time  taken  for 
the  Markov  analysis;  the  preprocessing  transformation  is  very  fast  in  comparison  (less 
than  a  second  of  CPU  time  for  the  examples).  We  consider  two  components  of  the 
analysis  separately:  how  the  time  for  the  overall  analysis  process  described  in  Subsec¬ 
tion  5.1.3  grows  with  the  number  of  registers  in  the  circuit,  and  how  the  time  taken  for 
the  procedure  analyzing  a  single  register  (Procedure  AnalyzeRegister,  given  in  Figure 
5-2)  grows  with  the  bit  width  of  the  register. 

The  overall  analysis  process  is  iterative  in  nature.  A  single  iteration  analyzes  each  reg¬ 
ister  in  the  circuit  at  most  once,  and  so  the  time  per  iteration  grows  linearly  with  the 
number  of  registers.  An  exact  complexity  analysis  of  the  overall  process  has  not  been 
done  because  determining  the  number  of  iterations  needed  is  difficult,  and  would 
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require  a  numerical  analysis  technique.  In  general,  circuits  with  indirect  feedback  take 
the  most  time,  because  they  require  more  iterations.  In  our  experience,  most  circuit 
require  only  a  few  iterations. 

The  time  required  to  analyze  a  single  register  grows  exponentially  with  the  size  of  the 
register;  this  growth  is  inherent  to  the  problem,  since  each  possible  register  state  must 
be  examined.  It  should  be  noted  that  for  real  circuits,  registers  are  of  fixed  size,  and  so 
it  is  important  only  that  the  analysis  procedure  be  tractable  for  that  size  register.  The 
overall  execution  time  is  reasonably  fast  for  the  four  bit-  and  nine  bit-wide  register 
transfer  level  datapaths  used  as  examples  in  Chapter  Eight;  our  slowest  example 
required  less  than  four  seconds  of  CPU  time  on  a  SUN  SPARCstation  IPC  with  36  MB 
of  memory.  However,  the  procedure  as  it  stands  now  is  slow  for  larger  registers,  e.g. 
16  or  32  bits.  It  is  hoped  that  further  research  will  find  ways  to  speed  up  the  analysis. 
Even  as  it  stands  now,  the  model  provides  insight  into  test  quality,  and  thus  is  useful  in 
validating  faster  heuristics;  our  research  is  exploring  heuristics  to  compute  the  ran¬ 
domness  and  transparency  metrics  directly,  without  reliance  on  the  underlying  vectors 


and  matrices. 
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5.4  The  Markov  Model  at  the  Gate  Level 

Up  to  this  point,  this  chapter  has  described  a  Markov  model  for  the  analysis  of  register 
transfer  level  circuits.  We  now  shift  focus  to  examine  the  special  case  when  we  have  a 
gate  level  circuit  instead.  We  start  by  grouping  the  individual  gates  of  the  gate  level 
circuit  into  combinational  logic  segments,  so  that  we  can  produce  a  “pseudo-RTL” 
view  of  the  circuit.  At  this  point,  the  procedure  for  calculating  the  1-probability  of  one 
flip-flop  in  a  gate  level  circuit  is  analogous  to  calculating  the  probability  distribution 
of  a  register  in  a  RTL  circuit,  shown  in  Figure  5-2.  The  only  differences  are  that  at  the 
gate  level,  signals  are  one  bit  wide,  so  the  probability  distribution  for  a  signal’s  state, 
which  is  represented  as  a  vector  at  the  register  transfer  level,  can  be  represented  with  a 
single  number,  the  1 -probability  for  the  signal.  In  addition,  the  state  transition  matrices 
used  by  the  Markov  model  are  now  always  two  by  two.  These  differences  mean  that 
the  Markov  model  analysis  done  at  the  register  transfer  level  can  be  replaced  with  sim¬ 
ple  scalar  equations  at  the  gate  level.  These  equations  provide  analytical  values  for  the 
1-probability  of  each  flip-flop  in  the  circuit;  entropies  may  be  derived  from  the  1-prob- 
abilities  using  Equation  4-1  on  page  47. 

This  section  begins  by  describing  how  to  create  the  “RTL”  view  of  the  gate  level  cir¬ 
cuit.  It  then  describes  a  computation  method  for  the  1-probabilities  of  the  signals 
throughout  a  circuit.  Finally,  it  shows  how  this  computation  method  is  derived  from 
the  more  general  Markov  model  of  Chapter  Five. 
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5.4.1  Creating  a  “RTL”  view  of  a  gate  level  circuit 

Before  we  can  do  a  probabilistic  analysis  of  a  gate  level  circuit,  we  must  identify  the 
cone  of  logic  that  is  driving  each  flip-flop  and  primary  output  of  the  circuit.  This 
allows  us  to  create  a  “RTL”  view  of  the  gate  level  circuit.  Actually,  “flip-flop  transfer 
level”  may  be  a  more  appropriate  term,  since  the  components  of  our  “RTL”  view  are 
single  flip-flops  rather  than  registers,  and  combinational  logic  cones,  or  blocks  with  a 
single  bit  output  rather  than  typical  combinational  logic  blocks  that  have  more  ttian 
one  bit  of  output.  Note  that  if  the  output  of  a  gate  fans  out,  it  may  drive  more  than  one 
of  the  flip-flops  in  the  circuit.  In  this  case,  that  gate  will  appear  in  more  than  one  of  the 
logic  cones  in  the  “RTL”  view.  An  example  of  this  process  will  be  given  in  Chapter 
Seven,  when  we  use  Markov  analysis  to  do  BIST  insertion  in  a  gate  level  submodule 
of  an  industrial  design. 

As  an  aside,  we  mention  here  some  recent  work,  presented  in  [PaBN94],  that  derives  a 
true  register  transfer  level  circuit  from  a  gate  level  circuit.  The  method  begins  in  the 
same  way  we  do,  by  grouping  the  gates  into  the  logic  cones  that  drive  each  flip-flop 
and  primary  output.  It  then  attempts  to  look  for  logic  cones  that  have  similar  functions 
so  that  it  can  group  individual  flip-flops  into  registers  in  a  meaningful  way.  For  exam¬ 
ple,  if  it  notices  that  a  number  of  the  logic  cones  are  bit  slices  of  an  adder,  it  will  group 
those  bit  slices  together,  joining  the  flip-flops  at  the  outputs  of  the  cones  into  a  register. 
Note  that  even  in  the  general  case,  when  there  is  no  clear  relationship  between  the 
logic  cones,  it  is  always  possible  to  group  flip-flops  together  to  do  a  true  register  trans- 
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fer  level  analysis  by  allowing  general  “arithmetic  logic  units.”  However,  this  work  is 
most  meaningful  if  there  is  some  underlying  register  transfer  level  structure  inherent  to 
the  gate  level  circuit. 

5.4.2  1-probability  computation 

This  subsection  presents  a  procedure  for  the  computation  of  1 -probabilities  of  various 
signal  lines  in  a  sequential  circuit.  Note  that  if  the  circuit  were  exclusively  combina¬ 
tional,  we  could  find  1-probabilities  for  the  lines  of  the  network  using  the  signal  prob¬ 
ability  algorithm  of  Parker  and  McCluskey  (see  [PaMc75],  [BaMS87]).  This 
algorithm  expresses  the  1-probability  at  the  output  of  a  combinational  logic  block  as  a 
function  of  the  1-probabilities  of  the  inputs  of  the  combinational  logic  block.  The  key 
to  the  algorithm  are  input/output  signal  probability  relationships  for  the  basic  gates, 
shown  in  Table  5-1.  However,  since  we  will  be  working  with  sequential  circuits,  we 
will  require  more  than  input-output  relations  for  gates;  we  must  also  be  concerned 
with  the  1 -probabilities  of  the  values  stored  in  the  flip-flops,  since  the  signal  probabil¬ 
ities  depend  not  only  on  the  random  inputs,  but  also  on  the  state  of  the  flip-flops  of  the 
CUT.  The  Markov  model  expresses  the  next  state  of  a  flip-flop  of  the  circuit  as  a  func¬ 
tion  of  the  current  state.  As  we  shall  see  in  the  next  subsection,  computing  the  flip-flop 
1 -probabilities  necessitates  classifying  the  flip-flops  in  terms  of  whether  they  feed 
back  directly  into  themselves,  and  whether  they  are  test  registers  included  in  the  circu¬ 
lar  seif-test  path  (for  circuits  using  the  circular  BIST  methodology).  The  Markov 


89 


model  yields  the  set  of  formulas  for  the  computation  of  flip-flop  1 -probabilities  shown 
in  Table  5-2,  where  the  notation  used  is  defined  in  Table  5-3.  The  signal  probabilities 
PCL'  PcLO  PCLi  computed  using  the  Parker-McQuskey  algorithm. 

We  would  like  to  underscore  one  fact  about  the  formulas  of  Table  5-2  for  circular 

BIST  test  flip-flops.  Note  that  in  both  the  feedback  and  no  feedback  cases,  the  1-prob- 

1 

ability  of  a  circular  BIST  test  flip-flop  is  equal  to  2  as  long  as  pq,  the  1-probability  of 
the  preceding  flip-flop  in  the  circular  BIST  chain,  is  2  •  In  the  case  of  circular  BIST, 
this  means  that  if  the  first  flip-flop  in  the  chain  of  test  flip-flops  is  driven  by  a  good 
quality  test  pattern  generator,  all  of  the  flip-flops  in  the  chain  will  generate  highly 
pseudorandom  test  patterns,  regardless  of  the  nature  of  the  circuit  under  test.  For  the 
circular  self- test  path  technique,  for  which  the  ends  of  the  chain  of  test  flip-flops  are 
connected  to  form  a  circle,  if  one  flip-flop  in  the  circular  self-test  path  has  good  ran¬ 
domness  properties,  all  of  the  flip-flops  in  the  circular  self-test  path  will  have  good 
randomness  properties. 

5.4.3  An  example  of  1 -probability  computation 

This  subsection  presents  a  small  example  of  the  1-probability  computation.  The  exam¬ 
ple  circuit  of  Figure  5-i6(a),  which  builds  a  JK  flip-flop  out  of  a  D  flip-flop  and  some 
combinational  logic,  has  one  flip-flop  that  feeds  back  directly  into  itself. 
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gate  type 

input  1-probabilities 

output  1-probability 

^-input  AND 

P[,P2^-,Pk 

k 

Op, 

/=  1 

^:-input  OR 

P\,P2^...-,Pk 

k 

i-  n  (i-p,) 

i  =  1 

^-input  NAND 

Ph  Pk 

k 

1-  YlPi 

1  =  1 

A:-input  NOR 

Pl,P2,-,Pk 

k 

n  (i-p,) 

/=  1 

2-input  XOR 

PhPl 

Pl+P2-2piP2 

NOT 

P 

l-p 

Table  5-1.  Input/output  signal  probability  relations  (for  combinational  logic). 

The  first  step  is  1-probability  computation  for  the  state  of  the  flip-flop.  Since  the  cir¬ 
cuit  has  direct  feedback,  two  conditional  1-probabilities,  Pdo  and  PcLh  corresponding 
to  cases  in  which  the  feedback  has  value  ‘0’  and  value  ‘  1’,  respectively,  are  computed. 
The  appropriate  circuit  for  the  computation  of  pquq  is  shown  in  Hgure  5- 16(b);  the 
annotated  1-probabilities  of  the  signal  lines  come  from  a  straightforward  application 
of  the  gate  relationship  formulas  of  Table  5- 1 .  For  example,  since  the  inverter  on  the  K 
line  has  an  input  1-probability  of  0.5,  its  output  probability  is  (1  -  0.5)  or  0.5.  The  only 
exception  is  when  the  inputs  to  a  gate  are  correlated  due  to  reconvergent  fanout;  then. 
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flip-flop  type 

formula  for  1 -probability 

no-feedback,  normal 

PCL 

no-feedback,  circular  BIST 

Pg  -  ^PgPcl  +  PCL 

feedback,  normal 

PCLO 

^  ^Pclo~PclO 

feedback,  circular  BIST 

Pg  ~  ^Pg^clo  PcLo 

^  ^PcL0~PcLl^  ~^Pg^PcLO~Pcl0 

Table  5-2.  Formulas  for  computation  of  flip-flop  1 -probabilities  (for  sequential 


logic). 


notation 

meaning 

PCL 

for  a  flip-flop  F  without  direct  feedback,  the  signal  prob¬ 
ability  for  the  output  of  the  combinational  logic  driving 
the  flip-flop.  See  Figure  5- 14(b). 

PCLO 

for  a  flip-flop  F  with  direct  feedback,  the  conditional 
signal  probability  for  the  ouq)ut  of  the  combinational 
logic  driving  the  flip-flop,  given  that  the  feedback  line 
has  value  ‘O’.  See  Figure  5- 15(b). 

PCLl 

1 

for  a  flip-flop  F  with  direct  feedback,  the  conditional 
signal  probability  for  the  output  of  the  combinational  i 

logic  driving  the  flip-flop,  given  that  the  feedback  line 
has  value  ‘  1’ .  See  Figure  5-  15(c). 

i 

Pg 

i 

for  a  test  flip-flop  F,  the  1 -probability  of  the  flip-flop  G 
that  precedes  F  in  the  circular  self-test  path. 

Table  5-3.  Notation  for  1-probability  formulas. 
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^k-l 


^CL 


(a)  a  flip-flop  that  does  not  feed  back  (b)  the  appropriate  model  for  calculation 
into  itself.  of  p^. 

Figure  5- 14.  The  no  feedback  case. 


r 

« 

combinational 

flip- 

• 

• 

logic 

flop 

block 

> 

V.  J 

(a)  a  flip-flop  that  feeds  back  into  itself 


Pi  ^ 

C  \ 

Pi 

n 

r  ^ 

^2  — 

combinational 

Pi 

• 

combinational 

• 

• 

logic 

^  PcLD 

logic 

Pk.l  — 

block 

Pk-I  ^ 

block 

D  -0 

P.=  1  " 

V  .  J 

(b)  the  appropriate  model  for  the  (c)  the  appropriate  model  for  the 
calculation  of  PcLO-  calcuktion  of  Pcu. 

Figure  5-15.  The  feedback  case. 
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(a)  a  circuit  implementing  a  JK  flip-flop. 


(b)  example  calculation  of  Pcw 
Figure  5-16.  An  example  calculation. 


a  larger  portion  of  the  circuit,  with  uncorrelated  inputs,  must  be  considered.  This  is  the 
case  for  the  OR  gate;  to  find  the  1 -probability  at  the  output  of  the  OR  gate,  we  must 
write  a  truth  table  showing  the  ou^ut’s  value  as  a  function  of  J,  K,  and  the  feedback 
line.  Each  line  of  the  truth  table  corresponds  to  a  different  combination  of  values  for  J, 
K,  and  the  feedback  line;  we  can  find  the  probability  that  a  particular  combination 
occurs  from  the  individual  1-probabilities  of  J,  K,  and  the  feedback  line.  For  example, 
the  probability  that  combination  (J,  K,  feedback)  =  (‘F,  ‘O’,  ‘1’)  occurs  is 


Pj  ^  ^  Pk^  ^feedback ' 

The  1 -probability  at  the  output  of  the  OR  gate  is  found  by  summing  the  probability  of 
occurrence  of  all  those  input  combinations  that  result  in  a  output  of  ‘  F. 
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From  the  figure,  we  see  that  Pcuo^  the  1-probability  at  the  output  of  the  combinational 
logic  segment,  has  value  0.5.  A  similar  computation,  this  time  setting  the  1-probability 
of  the  feedback  line  to  1,  gives  a  value  of  0.5  for  pcn-  Finally,  the  1 -probability  of  the 
flip-flop  is  computed  by  using  the  appropriate  formula  from  Table  5-2: 

_ PcLO _ .5 

"  1+  (Z'cLO-^CLl)  ~  1+ 

Once  pp,  the  1-probability  for  the  flip-flop,  has  been  computed,  the  (unconditional)  1- 
probabilities  for  all  signal  lines  can  be  recomputed  using  the  Parker-McCluskey  algo¬ 
rithm,  this  time  using  the  actual  value  for  pp  instead  of  a  constant  0  or  1.  These  final  1- 
probabilities  are  shown  annotated  to  each  signal  line  of  Figure  5- 16(a). 

5.4.4  Iterating  to  analyze  the  entire  circuit 

As  was  the  case  with  register  transfer  level  circuits,  an  iterative  procedure  must  be 
used  to  resolve  indirect  feedback.  For  exclusively  feed-forward  circuits,  only  one  iter¬ 
ation  is  needed,  but  for  circuits  with  circular  BIST  or  indirect  feedback  loops,  an  itera¬ 
tive  process  analogous  to  that  of  Figure  5-5  is  used  to  assign  arbitrary  initial  1- 
probabilities  to  the  flip-flops  and  then  repeatedly  analyze  flip-flops  until  the  1-proba¬ 
bilities  have  reached  steady-state  values. 
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5.4.5  A  note  about  computational  complexity 

The  original  Parker-McQuskey  algorithm  for  signal  probability  calculation  has  com¬ 
plexity  that  grows  exponentially  with  the  size  of  the  circuit  in  the  worst  case.  An 
extension  by  Savir  et.  al.  [BaMS87]  reduces  complexity  by  cutting  reconvergent 
fanout  in  the  circuit,  thereby  turning  the  circuit  into  a  tree.  In  doing  so,  exactness  is 
sacrificed;  the  cutting  algorithm  produces  upper  and  lower  bounds,  rather  than  exact 
signal  probabilities.  For  the  industrial  subcircuit  that  we  use  as  an  example  in  Chapter 
Seven,  we  were  able  to  apply  the  original,  exact  algorithm;  however,  for  larger  cir¬ 
cuits,  a  technique  like  Savir’s,  which  has  complexity  that  grows  linearly  with  the  size 
of  the  circuit,  may  be  more  appropriate. 

The  next  subsection  provides  analytical  derivations  for  the  1-probability  formulas  of 
Table  5-2  in  terms  of  a  Markov  model  for  the  state  of  the  flip-flops  in  the  sequential 
circuit. 


5.4.6  Derivation  from  the  Markov  model 

This  subsection  presents  a  mathematical  derivation  for  the  formulas  of  Table  5-2  in 
terms  of  a  Markov  model.  The  Markov  model  describes  the  operation  of  a  sequential 
circuit  with  circular  BIST,  using  an  “RTL”  view  of  the  circuit.  The  major  difference 
that  distinguishes  the  Markov  model  at  the  gate  level  from  the  Markov  model  at  the 
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register  transfer  level  is  that  at  the  gate  level  all  state  transition  matrices  are  two  by 
two  and  each  signal’s  state  probability  distribution  vector  can  be  replaced  with  a  single 
number,  the  1-probability  of  the  signal. 


Step  l.Q  ior  gate  level  circuits 

The  first  step  in  analyzing  a  flip-flop  is  to  compute  a  matrix  Q  describing  the  flip- 
flop’s  transitions  from  state  to  state  during  normal  (BIST-disabled)  operation.  The 
form  of  the  matrix  Q  depends  on  whether  the  flip-flop  feeds  directly  back  into  itself. 
For  a  flip-flop  F  that  does  not  feed  back  directly  into  itself  (see  Figure  5- 14),  we  have 
simply  that: 


q.j  =  Pr  {next  flip-flop  state  isy} 

=  F*r  {the  combinational  logic  driving  F  has  output  value  j} 

for  ie  {0,  1}  .  This  probability  can  be  found  by  applying  the  Parker-McCluskey  algo¬ 
rithm  to  the  combinational  logic  driving  F  [BaMS87];  thus,  we  have: 


Q  = 


^  PCL  PCL 
^~PCL  PCL 


(EQ  5-8) 


where  p^i  is  the  1 -probability  for  the  output  of  the  combinational  logic  driving  flip- 
flop  F. 


A  flip-flop  F  driven  directly  by  a  primary  input  is  a  special  case  of  the  no-feedback 
class.  Since  during  test  flip-flop  F  is  driven  directly  by  a  TPGR  that  generates  random 
patterns  with  a  uniform  distribution,  F  has  state  transition  matrix 
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This  can  easily  be  modified  to  accommodate  the  use  of  weighted  random  patterns 
[BaMS87];  if,  for  example,  a  particular  pseudorandom  input  is  weighted  so  that  it  has 
value  ‘  r  three-quarters  of  the  time,  the  proper  state  transition  matrix  is; 


Q  = 


.25 

.25 


.75 

.75 


Flip-flops  that  feed  back  directly  into  themselves  must  be  handled  differently.  Con¬ 
sider  a  flip-flop  F  that  feeds  directly  back  into  itself  (see  Figure  5-15).  As  in  the  previ¬ 
ous  case,  the  flip-flop  f”s  new  state  comes  from  the  output  of  the  combinational  logic 
driving  F.  The  computation  of 


q.j  =  Pr  {next  flip-flop  state  is  j  I  current  flip-flop  state  is  i} 

requires  two  separate  applications  of  the  Parker-McCluskey  algorithm,  one  in  which 
the  feedback  has  been  assigned  a  constant  value  of  ‘0’  (yielding  Pcu))^  one  in 
which  the  feedback  has  been  assigned  a  constant  value  of  ‘  V  (yielding  Pcu)-  We  then 
have: 


^  PCLO  PCLO 
^~PCLl  PCLl 


(EQ  5-9) 
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Step  2,  C  for  gate  level  circuits 

Step  2  of  the  analysis  procedure  of  Figure  5-2  modifies  Q,  the  flip-flop’s  state  transi¬ 
tion  matrix  for  the  BIST-disabled  case,  to  compute  C,  the  state  transition  matrix  for  the 
flip-flop  when  BIST  is  enabled.  Note  that  if  the  flip-flop  is  not  in  the  circular  self-test 
path,  in  test  mode  it  operates  exactly  as  it  does  in  normal  mode,  and  therefore  C  -Q. 
For  circular  BIST  test  flip-flops,  the  elements  of  C  are  obtained  by  application  of 
Equation  5-6: 

C  =  ,  (EQ5-10) 

^10^^  ”^G^  ■'■^ll^G  ^lO^G  ^11  ^  ^  ~^G^ 

where  pq  is  the  1-probability  of  the  preceding  flip-flop  G  in  the  chain. 

Step  3,  Deriving  the  1- probability 

Step  3  of  the  analysis  procedure  of  Figure  5-2  uses  C,  the  flip-flop’s  state  transition 
matrix  when  BIST  is  enabled,  computed  in  Step  2,  to  compute  the  flip-flop’s  1-proba- 
bility  in  the  steady  state.  All  of  the  analysis  done  at  the  register  transfer  level  applies 
here  as  well;  however,  since  C  is  two  by  two,  it  is  possible  to  write  the  1-probability  as 
a  simple  function  of  the  elements  of  C,  rather  than  in  terms  of  a  QR  decomposition. 
Suppose  diat  denotes  F’s  1-probability  in  the  steady  state.  Equation  5-7  becomes: 


”1 

1 

1 

11 

- 1 

1 

^00 

^01 

_  10 

^11 

In  solving  the  system,  we  find  that 
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_  01 

l-Cji  +  Coi 


1  -  C 


00 


1  Cqq+Cjq 


(EQ  5-11) 


Note  that  we  have  two  equations  in  one  unknown;  it  is  always  the  case  that  C  is  rank 
deficient,  so  that  both  equations  can  be  solved  with  the  same  value  of  pf. 


The  formulas  of  Table  5-2  are  found  by  simple  substitution  of  the  values  for  the  ele¬ 
ments  of  C  into  Equation  5-8  for  the  four  separate  classes  of  flip-flops.  For  example, 
consider  a  normal  (non-test)  flip-flop  that  does  not  feed  back  into  itself.  Since  C  =  Q  in 
this  case,  we  substitute  the  elements  of  Q,  shown  in  Equation  5-8,  into  Equation  5-11: 


Pp  - 


'01 


1  Cjj-I-Cqj 


PcL 

^-PCL^PCL 


~  PcL  ■ 


5.5  The  Markov  Model  at  the  Algorithmic  Level 

This  section  describes  how  the  same  Markov  model  that  was  used  in  Section  5. 1  to 
analyze  circuits  at  the  register  transfer  level  (RTL)  can  be  used  to  analyze  algorithmic 
descriptions  of  circuits  in  the  behavioral  domain.  Such  descriptions  consist  of  signals 
and  operators,  and  are  commonly  depicted  using  data  flow  graphs',  Hgure  5-  17(a) 
shows  a  data  flow  graph  for  the  behavior  ax+b. 

We  do  our  data  flow  graph  analysis  by  mapping  the  data  flow  graph  into  an  equivalent 
RTL  circuit  and  analyzing  the  RTL  circuit  using  the  Markov  chain  model  already 
described.  We  begin  the  mapping  by  choosing  a  schedule  for  the  data  flow  graph;  this 
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(a)  data  flow  graph.  (b)  RTL  structure. 

Figure  5- 17.  One-to-one  mapping  between  a  scheduled  data  flow  graph  and  the 
RTL  structure  used  to  analyze  the  graph. 


requires  assigning  the  operator  nodes  of  the  data  flow  graph  to  time  steps.  Our  choice 
for  the  data  flow  graph  of  Figure  5-  17(a)  is  designated  on  the  figure  by  separating  the 
time  steps  with  broken  lines.  Note  that  there  are  many  ways  to  schedule  a  given  data 
flow  graph;  it  does  not  matter  which  one  we  choose,  because  the  results  of  the  analysis 
will  be  the  same  for  all  of  them.  Figure  5-  17(b)  shows  the  RTL  mapping  for  the  exam- 
ple.The  mapping  replaces  each  operator  node  in  the  data  flow  graph  with  an  arithmetic 
logic  unit  (ALU).  Each  edge  of  the  graph,  which  represents  an  n-bit  wide  signal,  is 
mapped  to  a  number  of  n-bit  wide  registers  in  series.  The  number  of  registers  is  deter¬ 
mined  by  counting  the  number  of  time  step  boundaries  that  the  edge  crosses.  For 
example,  the  signal  coming  from  primary  input  b  crosses  over  two  time  step  bound¬ 
aries  before  it  enters  the  adder,  so  it  is  mapped  to  two  registers  in  the  RTL  circuit, 
labelled  b  and  b’.  The  mapping  is  done  this  way  so  that  correlation  due  to  reconvergent 
fanout  in  the  data  flow  graph  is  preserved  in  the  register  transfer  level  circuit.  Note  that 
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this  mapping  is  solely  for  the  purpose  of  analysis;  we  do  not  propose  physically  imple¬ 
menting  the  circuit  this  way.  We  then  analyze  the  RTL  circuit,  and  transfer  the  ran- 
donmess,  expected  state  coverage,  and  transparency  metrics  from  the  registers  of  the 
RTL  circuit  back  to  the  edges  of  the  data  flow  graph. 


5.6  Summary 

A  Markov  model  for  the  analysis  of  conventional  and  circular  BIST  was  presented. 
The  model  is  applicable  at  a  variety  of  levels  of  design  abstraction;  mathematical 
details  were  provided  for  the  gate  level,  the  register  transfer  level,  and  the  algorithmic 
level.  In  addition,  a  transformation  technique,  used  to  process  the  circuit  before  apply¬ 
ing  the  Markov  model,  was  presented;  this  transformation  technique  strengthens  the 
analysis  by  allowing  accurate  modeling  of  the  effects  of  word-level  correlation  within 
the  circuit.  The  model  will  be  used  to  provide  analytical  measures  for  our  testability 
metrics:  randomness,  expected  state  coverage,  and  transparency.  Chapter  Eight  will 
include  a  number  of  examples  for  which  the  analytical  values  for  the  testability  met¬ 
rics  from  the  Markov  model  are  compared  to  actual  results  obtained  from  simulation, 
to  the  end  of  showing  that  the  Markov  model  accurately  captures  what  happens  within 


a  circuit. 
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The  testability  metrics  computed  using  the  Markov  model  will  be  used  throughout  the 
rest  of  this  dissertation.  In  Chapter  Six,  randomness  and  expected  state  coverage  will 
be  used  as  tools  to  examine  some  structural  difficulties  inherent  to  the  circular  self-test 
path  technique.  In  Chapters  Seven,  Eight,  and  Nine,  the  testability  metrics  will  be  used 
to  guide  BIST  insertion  at  the  gate  level,  register  transfer  level,  and  algorithmic  level, 
respectively.  The  testability  metrics  provide  a  means  of  measuring  the  effect  of  a  test 
insertion  decision  (concerning  which  signals  should  correspond  to  test  registers  or  test 
flip-flops)  on  test  quality. 


/ 


Chapter  6 


Structural  Constraints  on 
Circular  Self-Test  Paths 


This  chapter  develops  constraints  on  the  structure  of  circular  self-test  paths  in  register 
transfer  level  (RTL)  circuits  with  circular  built-in  self-test  (BIST)  features.  The  con¬ 
straints  arise  from  the  desire  to  avoid  bit-level  correlation,  which  can  have  a  devastat¬ 
ing  effect  on  test  quality.  Two  causes  of  bit-level  correlation  are  examined,  with 
examples  demonstrating  the  resulting  degradation  in  test  quality.  The  first  cause  of  bit- 
level  correlation,  register  adjacency,  is  a  by-product  of  the  ordering  of  the  registers 
within  the  circular  self-test  path.  The  second  cause  of  bit-level  correlation  stems  from 
the  shifting  nature  of  the  circular  self-test  path. 

The  work  described  in  this  chapter  goes  beyond  previous  work  in  a  number  of  ways. 
First,  the  concept  of  register  adjacency  is  generalized  to  arbitrary  distances.  Second,  a 
newly  discovered  source  of  bit-level  correlation  in  circuits  using  the  circular  self-test 
path  technique  is  described.  This  correlation  is  a  natural  consequence  of  the  shifting  of 
bits  in  the  circular  self-test  path;  its  presence  greatly  increases  the  probability  of  sys¬ 
tem  state  cycling,  which  can  have  a  devastating  effect  on  test  quality.  Guidelines  for 
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avoiding  this  source  of  correlation  by  placing  a  new  constraint  on  the  structure  of  the 
circular  self-test  path  are  presented. 

This  chapter  is  organized  as  follows.  Section  6.1  reviews  the  circular  self-test  path 
insertion  process,  which  is  the  motivation  for  our  work.  Section  6.2  covers  material 
preliminary  to  the  main  topic  of  the  chapter;  it  describes  how  the  system  state  transi¬ 
tion  graph  can  be  used  as  a  tool  in  understanding  system  state  cycling.  Section  6.3 
describes  a  structural  constraint  on  the  circular  self-test  path  arising  from  the  desire  to 
avoid  bit-level  correlation  caused  by  register  adjacency.  Section  6.4  describes  a  con¬ 
straint  arising  from  a  bit-level  correlation  that  occurs  because  of  the  shifting  nature  of 
the  circular  self-test  path.  Section  6.5  is  a  summary. 


6.1  Circular  Self-Test  Path  Insertion 

The  process  of  inserting  circular  self-test  path  features  into  a  register  transfer  level  cir¬ 
cuit  requires  two  main  steps.  The  first  is  flie  selection  of  registers  for  the  circular  self¬ 
test  path;  this  can  be  accomplished  in  a  number  of  ways,  ranging  from  selecting  a  max¬ 
imal  path  of  all  registers,  to  selecting  a  minimal  path  of  only  the  primary  input  and  pri¬ 
mary  output  registers.  Selection  of  registers  can  also  be  done  using  heuristics  based  on 
circuit  structure  and  testability  measures,  similar  to  what  [POLB88]  and  [LiZB93] 
have  done  at  the  gate  level.  In  Chapter  Eight,  we  will  do  BIST  insertion  at  the  register 
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transfer  level  based  on  testability  metrics,  computed  using  the  Markov  model 
described  in  Qiapter  Five. 

The  second  step  of  circular  self-test  path  insertion  is  the  formation  of  the  circular  self¬ 
test  path  from  the  selected  registers.  A  circular  self-test  path  consists  of  an  ordered  and 
oriented  set  of  circuit  registers.  An  algorithm  to  link  die  selected  registers  into  a  circu¬ 
lar  self-test  path  must  choose  both  the  ordering  for  the  registers  in  the  path,  and  the  ori¬ 
entation  or  shift  direction,  right  or  left,  for  each  register  in  the  path. 

Analysis  of  the  quality  of  the  circular  self-test  path  technique  as  a  testing  technique  so 
far  has  assumed  that  there  is  no  statistical  correlation  between  the  streams  of  bits 
applied  to  the  two  inputs  of  any  of  the  exclusive  OR  (XOR)  gates  in  the  circular  self¬ 
test  path  [PiKK92].  Figure  6- 1  shows  a  single  flip-flop  from  the  circular  self-test 
path.  The  assumption  is  that  bits  x  and  y  are  not  correlated.  Unfortunately,  for  some 
circular  self-test  path  structures,  this  assumption  does  not  hold;  bit-level  correlation 
occurs,  and  test  effectiveness  can  be  severely  affected.  It  is  important  to  identify  the 
structures  that  cause  bit-level  correlation  so  that  they  can  be  avoided  during  formation 
of  the  circular  self-test  path.  This  chapter  examines  two  causes  of  bit-level  correlation, 
and  the  limitations  that  must  be  placed  on  the  structure  of  (he  circular  self-test  path  to 
avoid  the  correlation.  Preliminary  to  the  discussion  of  these  constraints,  the  system 
state  transition  graph,  a  tool  for  understanding  the  problem  of  system  state  cycling,  is 


described. 
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Figure  6-1.  A  single  flip-flop  of  the  circular  self-test  path. 
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6.2  System  State  Cycling 

One  problem  that  has  plagued  the  circular  self-test  path  technique  is  that  of  system 
state  cycling,  or  limit  cycling.  The  state  of  a  system  is  determined  by  the  contents  of 
the  systan’s  memory  elements.  Following  the  definitions  of  Brynestad,  Aas,  and  Vall- 
estad  [BrAS90],  let  the  system  state  transition  graph  (STG)  of  a  circuit  be  a  directed 
graph,  where  the  nodes  represent  system  states  and  the  edges  represent  transitions 
between  system  states.  An  example  shape  of  an  STG  is  shown  in  Figure  6-2.  When  in 
test  mode,  the  system  as  a  whole  is  deterministic;  since  no  data  enter  the  circuit  via  the 
primary  inputs,  given  the  current  state,  there  is  one  unique  next  state.  This  means  that 
there  is  exactly  one  edge  leaving  each  node.  Note  that  there  may  be  more  than  one 
edge  entering  a  node.  In  general,  the  STG  is  partitioned  into  one  or  more  cycles  or 
rings,  which  may  have  tails,  or  paths  of  states  that  are  not  part  of  the  cycle  itself,  but 
eventually  lead  to  the  cycle.  The  first  node  of  each  tail  has  special  significance;  since 
such  a  node  has  no  edges  leading  into  it,  it  corresponds  to  a  state  that  can  not  be  gener- 
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Figure  6-2.  Example  shapes  for  the  STG. 

ated  naturally  during  a  test  session.  We  call  such  a  state  an  unreachable  state;  in 
[BrAS90],  these  states  are  called  leaves. 

Each  test  session  begins  by  putting  the  system  in  an  initial  state;  this  state  is  called  the 
system  seed.  This  corresponds  to  picking  one  node  of  the  STG.  Note  that  it  is  possible 
to  pick  an  unreachable  state  as  the  system  seed,  since  the  system  is  not  in  test  mode 
when  we  initialize  it.  Next,  the  system  is  clocked;  each  clock  corresponds  to  moving 
along  one  edge  of  the  STG  to  another  system  state.  If  the  system  is  clocked  enough 
times,  it  will  end  up  in  a  state  that  it  has  already  visited.  From  that  point  on,  aU  system 
states  generated  will  be  repeats  of  previously  generated  states;  on  the  STG,  a  path 
around  and  around  a  cycle  is  taken,  and  we  say  that  the  system  has  entered  a  cycle. 
Note  that  the  test  patterns  applied  to  a  given  combinational  logic  block  also  start 
cycling  at  this  point,  since  the  bits  in  a  test  pattern  are  a  subset  of  the  bits  in  the  system 
state.  As  a  result,  no  new  test  patterns  for  the  combinational  logic  blocks  are  gener- 
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ated.  Thus,  if  a  system  enters  a  cycle  early  in  a  test  session,  test  quality  will  be  low 
regardless  of  the  length  of  the  session. 

All  deterministic  systems,  including  circuits  utilizing  conventional  BIST  techniques, 
are  subject  to  system  state  cycling;  however,  circular  self -test  path  circuits  are  espe¬ 
cially  likely  to  enter  a  system  state  cycle  early  in  a  test  session.  The  shape  of  the  STG 
indicates  this  susceptibility;  ideally,  we  would  like  to  choose  our  test  session  so  that 
we  travel  along  a  long  path  in  the  STG  before  we  enter  a  cycle.  If  the  STG  is  com¬ 
posed  of  a  large  number  of  small  cycles,  this  may  not  be  possible.  Similarly,  if  the 
STG  has  a  large  number  of  tails,  we  may  not  be  able  to  avoid  cycling,  since  we  can 
choose  to  travel  along  at  most  one  of  the  tails.  By  looking  at  the  relationship  between 
certain  circular  self-test  path  structures  and  the  resulting  shape  of  the  STG,  we  gain 
insight  into  why  those  structures  can  result  in  poor  test  quality. 


6.3  Register  Adjacency 

One  cause  of  bit-level  correlation  is  unit-distance  register  adjacency  [Stro88] 
[POLB88].  This  phenomenon,  which  is  illustrated  in  Rgure  6-3,  occurs  when  flip-flop 
fj  comes  directly  after  flip-flop^-  in  the  circular  self- test  path,  and  is  also  fed  directly 
by  flip-flop^-  through  the  combinational  logic  of  the  datapath.  From  the  figure,  we  see 
that  unit-distance  register  adjacency  in  an  RTL  structure  can  be  thought  of  as  bit-level 
reconvergent  fanout;  the  output  of  flip-flop^  fans  out,  and  the  two  copies  of  the  bit 
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that  was  stored  in  fi  meet  again,  and  are  recombined,  before  being  stored  in  j^-  at  the 


next  clock.  One  copy  of  the  bit  takes  a  path  through  the  combinational  logic  of  the 


RTL  structure;  the  other  copy  travels  along  the  circular  self-test  path. 


Figure  6-3.  The  concept  of  register  adjacency. 


Figure  6-4  shows  a  simple  kind  of  register  adjacency.  Suppose  that  the  most  signifi¬ 
cant  bit  of  register  B  has  value  x.  On  the  next  clock,  this  bit  fans  out  to  both  inputs  of 
the  XOR  feeding  the  most  significant  bit  of  register  A;  this  correlation  of  inputs  means 
that  the  most  significant  bit  of  A  will  have  value  jc  ©  x  =  0 .  This  phenomenon  occurs 
at  each  time  step  of  a  test  session,  holding  the  most  significant  bit  of  A  constant  at 
value  0,  and  having  a  devastating  effect  on  the  randomness  of  register  A’s  state:  fully 
half  of  the  states  can  never  be  realized.  This  is  reflected  in  the  best  achievable  state 
coverage  for  register  A,  which  is  0.50.  In  terms  of  the  STG,  we  see  that  one  half  of  the 
system  states  (i.e.,  all  states  for  which  the  most  significant  bit  of  A  is  a  1)  are  unreach¬ 
able,  and  are  therefore  at  the  beginning  of  a  tail.  For  a  circuit  of  four-bit  registers, 
assuming  that  register  5’s  states  are  perfectly  pseudo-random  (the  best  possible  case). 


no 


Figure  6-4.  A  portion  of  an  RTL  structure  showing  unit-distance  register 

adjacency. 


register  B 

register  A 

randomness 

1.00 

0.75 

best  achievable 
state  coverage 

1.00 

0.50 

Table  6-1.  Metrics  for  the  registers  of  Figure  6-4. 


register  A’s  randomness  is  reduced  to  0.75,  reflecting  the  fact  that  while  three  of  the 
bits  of  A  are  pseudo-random,  one  bit  is  stuck  at  a  constant  value. 


Figure  6-5  is  another  example  of  register  adjacency.  As  in  the  last  example,  the  adja¬ 
cency  of  registers  B  and  A  causes  a  correlation  between  the  inputs  of  the  XOR  gate 
that  drives  the  most  significant  bit  of  A.  To  see  why,  let  the  most  significant  bits  of  reg¬ 
isters  B  and  D  have  values  y  andx,  respectively.  On  the  next  clock,  the  XOR  gate  will 
have  inputs  x  and  x&y,  for  a  result  of  x&.y .  At  the  same  time,  the  y  value  will  shift  into 
the  least  significant  bit  of  B.  This  means  that  there  is  a  correlation  between  the  most 
significant  bit  of  register  A  and  the  least  significant  bit  of  register  5;  namely,  if  the 
least  significant  bit  of  B  has  value  1 ,  the  most  significant  bit  of  A  must  have  value  0. 


Ill 


This  dependency  between  bits  of  the  system  state  holds  throughout  a  test  session, 
thereby  limiting  the  system  states  that  can  be  realized;  any  system  state  with  I’s  in 
both  the  least  significant  bit  of  B  and  the  most  significant  bit  of  A  is  unreachable.  A 
more  comprehensive  example,  in  Table  6-4  of  Section  6.4,  shows  that  the  presence  of 
so  many  tails  in  the  STG  gives  circuits  with  this  structure  a  tendency  to  cycle.  The 
dependency  also  degrades  the  randomness  of  register  A;  if  registers  B  and  D  are  per¬ 
fectly  random,  the  most  significant  bit  of  register  A  will  have  value  1  only  one-quarter 
of  the  time.  As  a  result,  when  registers  B  and  D  have  perfectly  random  states,  register 
A’s  randomness  is  decreased  to  a  value  of  0.94.  Although  it  may  still  be  possible  to 
generate  all  16  patterns  in  register  A,  doing  so  will  require  a  longer  test  session.  To 
illustrate  this,  consider  the  number  of  clocks  that  we  must  apply  to  the  circuit  to 
achieve  an  expected  state  coverage  of  99%  for  register  A,  given  that  system  state 
cycling  does  not  prevent  us  from  reaching  this  level  of  state  coverage  entirely.  As  the 
circuit  is  now,  we  must  apply  146  clocks;  if  register  A’s  randomness  were  l.O,  in  con¬ 
trast,  we  would  have  to  apply  only  72  clocks. 

Furthermore,  if  registers  A  and  B  serve  as  inputs  to  some  other  ALU  in  the  circuit,  the 
interdependency  between  the  states  of  the  two  registers  will  make  it  impossible  to 
exhaustively  test  that  ALU.  Specifically,  any  test  pattern  that  has  a  1  in  the  least  signif¬ 
icant  bit  of  B  and  a  1  in  the  most  significant  bit  of  A  can  not  be  generated. 

Unit-distance  register  adjacency  is  a  consequence  of  the  ordering  of  registers  in  the 
circular  self-test  path,  and  can  easily  be  identified.  We  first  construct  a  directed  graph 
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Figure  6-5.  A  second  example  of  distance- 1  register  adjacency. 


register  B 

register  D 

register  A 

randomness 

1.00 

1.00 

0.94 

1  best  achievable 

1  state  coverage 

1.00 

1.00 

1.00 

Table  6-2.  Metrics  for  the  registers  of  Figure  6-5. 


indicating  the  data  flow  from  memory  element  to  memory  element;  in  this  graph,  the 
nodes  correspond  to  the  flip-flops  of  the  circuit,  and  there  is  an  edge  from  node^-  to 
node  fj  if  and  only  if  flip-flop/'  feeds  directly  into  flip-flop/  through  combinational 
logic.  Next,  we  consider  the  flip-flops  in  the  circular  self-test  path,  two  by  two,  in 
order.  If  flip-flop/  directly  follows  flip-flop/  in  the  circular  self-test  path  and  there  is 
an  edge  from/  to/  in  the  graph,  we  have  exactly  the  situation  shown  in  Figure  6-3, 
and  there  is  register  adjacency  in  our  circuit.  Register  adjacency  can  usually  be 
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avoided  by  careful  ordering  of  registers  in  the  circular  self-test  path,  using  a  technique 
like  that  described  in  [Stro88].  In  the  event  that  no  ordering  of  registers  is  adjacency- 
free,  an  extra  flip-flop  can  be  inserted  into  the  path  to  break  up  the  adjacency.  While 
this  flip-flop  is  transparent  in  the  normal  mode,  in  test  mode  it  serves  to  delay  the  val¬ 
ues  traveling  around  the  circular  self-test  path  by  one  clock  [PiKK92].  An  alternative 
approach  for  avoiding  the  bit-level  correlation  caused  by  unit-distance  register  adja¬ 
cency  involves  modifying  the  design  of  the  test  register  [AvMc93].  Another  approach 
tests  whether  a  particular  instance  of  register  adjacency  has  a  significant  effect  on  test 
quality  before  restructuring  the  path  to  remove  it  [Gage93]. 

The  two  previous  examples  are  of  unit-distance  register  adjacency:  the  bit  value  that 
fans  out  reconverges  after  one  clock.  However,  register  adjacency  can  occur  at  arbi¬ 
trary  distances.  Distance-c?  register  adjacency  occurs  when  the  bit  value  that  fans  out 
reconverges  after  d  clocks.  Rgure  6-6shows  a  circuit  with  distance-2  adjacency.  For 
this  example,  registers  are  two  bits  wide.  Suppose  that  the  least  significant  bit  of  regis¬ 
ter  B  has  value  jc.  In  two  time  steps,  the  x  value  will  fan  out  and  travel  to  both  inputs  of 
the  XOR  that  drives  the  least  significant  bit  of  register  A;  one  copy  of  x  travels  along 
the  circular  self-test  path,  and  the  other  travels  through  the  datapath  (via  register  D). 
As  a  result,  the  least  significant  bit  of  register  A  will  have  value  x  ©  x  =  0 .  This  hap¬ 
pens  at  each  subsequent  clock,  after  an  initial  set-up  time  of  one  clock.  That  is,  this 
correlation  does  not  occur  the  very  first  time  the  circuit  is  clocked,  but  it  does  occur  at 
each  clock  after  that.  Once  again,  the  randomness  of  register  A  is  degraded. 
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i’  registers  1*^ 


Figure  6-6.  An  example  of  distance-2  register  adjacency. 


register  B 

register  D 

register  A 

randomness 

1.00 

1.00 

0.50 

best  achievable  state 
coverage 

1.00 

l.OO 

0.50 

Table  6-3.  Metrics  for  the  registers  of  Figure  6-6. 


An  algorithm  for  identifying  distance-^s?  register  adjacency  must  first  extend  the 
directed  graph  representing  data  flow  from  memory  element  to  memory  element  by 
annotating  the  edges  with  distances.  All  of  the  original  edges  have  distance  1.  If  in  the 
original  graph  there  is  a  path  from  node  i  to  node  j  traversing  d  edges,  an  edge  from  i  to 
j  with  distance  d  is  added  to  the  extended  graph.  Distance-rf  register  adjacency  exists 
between  flip-flops  and^-  if^-  comes  d  flip-flops  after in  the  circular  self-test  path 
and  there  is  a  distance-rf  edge  from  i  to  j  in  the  graph. 
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6.4  A  Shift- Related  Cause  of  Correlation 

Register  adjacency  is  not  the  only  phenomenon  that  causes  bit-level  correlation;  when 
an  arithmetic  logic  unit  in  the  RTL  structure  performs  a  bit-wise  operation,  correlation 
can  arise  as  a  natural  consequence  of  the  shifting  of  the  registers  in  the  circular  self¬ 
test  path.  Consider  the  circuit  fragment  of  Figure  6-7.  Suppose  that  the  least  significant 


Figure  6-7.  A  portion  of  an  RTL  structure  producing  bit-level  correlation  due  to 

shifting. 


bits  of  registers  B  and  D  have  values  x  and  y,  respectively.  Since  the  primary  inputs  of 
the  circuit  are  held  constant  at  zero  during  a  test  session,  registers  B  and  D  operate  as 
ordinary  shift  registers.  As  a  result,  the  jc  and  y  values  travel  to  the  left  in  lockstep  as 
the  circuit  is  clocked.  This  means  that  the  product  .x&y  is  formed  at  successive  outputs 
of  the  ALU.  Correlation  occurs  when  the  product  xScy  from  one  clock  meets  with  the 
product  from  the  previous  clock  and  cancels  out.  As  shown  in  Figure  6-7,  this  happens 
in  the  second  least  significant  bit  of  the  output  register  A.  The  input  to  the  XOR  gate 
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from  the  ALU  is  simply  the  product  x&y,  the  input  from  the  circular  self-test  path  is 
the  same  product,  formed  one  clock  cycle  earlier,  XORed  with  a  value  from  the  circu¬ 
lar  self-test  path,  (x&y)  ©  v .  The  product  cancels  out,  leaving  a  result  of  v.  A  similar 
correlation  occurs  in  every  second  bit  of  register  A. 

The  difficulty  with  this  type  of  correlation  is  that  register  A  no  longer  acts  like  a  multi¬ 
ple  input  shift  register;  the  cancellation  of  terms  causes  A  to  act  like  an  ordinary  shift 
register  from  the  point  of  view  of  the  circular  self-test  path.  In  other  words,  the  values 
shifted  out  of  A  onto  the  circular  self-test  path  are  simply  a  time-delayed  version  of  the 
bits  shifted  into  A.  This  may  affect  the  randomness  directly,  and,  more  importantly,  it 
makes  the  circuit  extremely  susceptible  to  system  state  cycling.  Note  that  when  the  cir¬ 
cuit  is  in  test  mode,  the  primary  input  registers  all  act  as  simple  shift  registers,  since  no 
inputs  are  presented  to  the  circuit.  The  more  registers  that  perform  simple  shifts 
instead  of  using  the  responses  of  the  circuit  to  find  the  next  state,  the  more  likely  the 
system  is  to  cycle. 

The  shifting  type  of  correlation  described  in  this  section  occurs  not  only  for  bit-wise 
operations  like  AND  and  OR,  but  also  for  operations  with  a  bit-wise  component.  For 
example,  this  same  phenomenon  occurs  for  the  ADD  function,  which  is  essentially  a 
bit-wise  XOR  with  some  carry  information  included.  While  the  cancellation  of  terms 
for  an  ADD  circuit  is  not  as  catastrophic  as  it  is  for  our  AND  example,  it  is  still  enough 
to  significantly  degrade  test  quality. 
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Unlike  the  register  adjacency  case,  this  cause  of  bit- level  correlation  cannot  be 
avoided  by  re-ordering  the  registers  in  the  circular  self-test  path;  for  our  example,  it 
does  not  matter  where  in  the  path  registers  B,  D  and  A  lie.  Instead,  we  must  pay  atten¬ 
tion  to  the  shift  direction  of  the  registers.  If  we  shift  input  registers  B  and  D  in  opposite 
directions,  the  jc  and  y  values  no  longer  travel  to  the  left  in  lockstep,  meeting  at  an 
AND  gate  at  each  time  step;  instead,  they  meet  only  once.  Similarly,  we  could  choose 
to  shift  the  output  register  in  a  direction  opposite  to  that  of  the  input  registers;  for  this 
configuration,  values  x  and  y  stUl  travel  in  lockstep,  but  their  product  from  one  time 
step  does  not  cancel  out  their  product  from  one  time  step  before.  Therefore,  when  con¬ 
structing  a  circular  self-test  path,  it  is  important  to  take  care  that  not  all  register  ports 
of  a  bitwise  operation  are  shifted  in  the  same  direction. 

Table  6-4  shows  the  relationship  between  the  choice  of  shift  directions  and  test  quality 
for  four-bit  AND  and  ADD  examples.  For  each  example,  three  circular  self-test  path 
configurations  are  used.  The  best  achievable  state  coverage  for  the  ALU  and  the  long¬ 
est  achievable  time  before  the  system  starts  cycling  are  shown;  these  are  found  via 
simulation,  using  the  initial  system  state  or  seed  that  gives  the  most  favorable  results. 
In  the  first  configuration,  all  three  registers  are  shifted  in  the  same  direction;  this  cir¬ 
cuit  suffers  from  the  shifting  correlation  described  above.  Note  that  the  AND  example 
enters  a  system  state  cycle  after  only  twelve  clocks.  The  second  configuration  avoids 
the  shifting  problem,  but  introduces  register  adjacency.  Here,  we  see  that  a  large  num¬ 
ber  of  unreachable  states,  and  therefore  a  large  number  of  tails  in  the  STG,  result  in 


118 


circuits  that  cycle  badly;  the  AND  example  enters  a  system  state  cycle  after  36  clocks, 
and  the  ADD  example  after  185  clocks.  The  third  configuration  shows  that  the  shift 
directions  can  in  fact  be  chosen  to  avoid  both  sources  of  bit-level  correlation.  The 
same  kind  of  shifting  correlation  that  occurs  in  the  first  configuration  will  also  occur 
when  this  small  example  is  embedded  in  a  larger  circuit. 
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Table  6-4.  Shift  direction  and  its  impact  on  test  quality. 


Figure  6-8  shows  shifting  correlation  within  a  larger  circuit.  For  this  example,  the  reg¬ 
isters  are  four  bits  wide.  As  in  the  smaller  example,  shifting  correlation  causes  the  out¬ 
put  register  to  act  as  a  shift  register  from  the  point  of  view  of  the  circular  self-test  path: 
bit-level  correlation  in  the  least  significant  bit  and  the  third  least  significant  bit  of  the 
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Figure  6-8.  A  second  example  of  bit-level  correlation  due  to  shifting. 


register  ensure  that  the  values  shifted  out  of  H  onto  the  circular  self- test  path  are  sim¬ 
ply  a  time-delayed  version  of  the  bits  shifted  into  H.  Note  that  for  this  circuit,  then,  all 
registers  in  the  path  are  essentially  shift  registers.  There  are  20  flip-flops  in  the  path;  as 
a  result,  the  state  of  the  path  will  cycle  in  20  clocks.  This  makes  it  extremely  likely  that 
the  entire  system  will  also  cycle  in  20  clocks,  making  this  circuit  impossible  to  test 
adequately. 

Identifying  shifting  correlation  in  a  circuit  using  the  circular  self-test  path  technique 
involves  looking  for  sections  of  the  circuit  that  perform  bitwise,  or  largely  bitwise. 
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operations.  When  such  a  section  exists,  shifting  correlation  will  occur  if  all  input  and 
output  registers  of  the  section  are  shifted  in  the  same  direction.  Therefore,  care  should 
be  taken  to  choose  a  set  of  register  orientations  such  that  at  least  one  of  the  input  and 
output  registers  of  the  section  is  shifted  in  a  direction  opposite  to  that  of  the  other  reg¬ 
isters. 


6.5  Summary 

Two  sources  of  bit-level  correlation  in  RTL  circuits  using  the  circular  self-test  path 
technique  have  been  described,  along  with  examples  demonstrating  the  degradation  in 
test  quality  resulting  from  the  presence  of  the  correlation.  In  avoiding  these  two 
sources  of  correlation,  constraints  are  placed  on  the  structure  of  the  circular  self-test 
path.  The  furst  source  of  correlation,  register  adjacency,  can  be  avoided  through  careful 
choice  of  the  ordering  of  the  registers  within  the  path.  The  second  source  of  correla¬ 
tion,  which  arises  from  the  shifting  nature  of  the  circular  self-test  path,  can  be  avoided 
through  careful  choice  of  a  shift  direction,  right  or  left,  for  each  register  in  the  path.  A 
clear  understanding  of  these  sources  of  correlation,  and  the  resulting  effect  in  terms  of 
both  test  patterns  that  can  not  be  generated  and  the  possibility  of  system  state  cycling, 
is  the  first  step  in  automating  the  circular  self-test  path  formation  process. 

Although  the  work  of  this  chapter  was  done  with  the  circular  self-test  path  methodol¬ 
ogy  in  mind,  much  of  it  is  also  applicable  to  the  circular  BIST  methodology.  As  was 
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reviewed  in  Chapter  Two,  the  two  methodologies  are  similar;  both  create  a  chain  out 
of  the  same  kind  of  test  register.  Register  adjacency  can  occur  in  circular  BIST  cir¬ 
cuits,  and  should  be  avoided  using  the  techniques  outlined  in  this  chapter.  The  shifting 
correlation  described  in  this  chapter  can  also  occur  in  circular  BIST  circuits.  However, 
in  the  case  of  circular  BIST,  system  state  cycling  does  not  become  overwhelmingly 
likely  when  shifting  correlation  is  present.  The  reason  for  this  difference  is  that  the  cir¬ 
cular  BIST  technique  uses  a  test  pattern  generation  register  (TPGR)  to  drive  the  input 
end  of  the  chain,  and  a  multiple  input  shift  register  (MISR)  to  watch  the  output  end  of 
the  chain,  rather  than  simply  recirculating  the  system  bits  the  way  the  circular  self-test 


path  methodology  does. 


Chapter  7 


BIST  Insertion  at  the  Logic 
Level 


In  industry  today,  design  and  test  are  often  considered  separate  problems;  design  of  a 
circuit  is  done  without  regard  to  testability,  and  BIST  features  are  added  to  the  design 
after  the  fact.  The  test  quality  of  circuits  designed  this  way  can  be  quite  poor;  often, 
the  underlying  structure  of  the  circuit  makes  it  difficult  to  test  using  pseudorandom  test 
patterns.  When  faced  with  such  a  random  pattern  resistant  design,  the  circuit  designer 
has  two  options.  The  fu^st,  to  re-synthesize  the  circuit  from  scratch,  this  time  keeping 
test  in  mind,  is  often  undesirable.  In  some  cases,  well  established  circuits,  known  to 
work  well,  have  been  described  informally,  e.g.,  with  block  diagrams.  Re-synthesis 
would  require  describing  the  function  of  such  a  circuit  formally  in  the  behavioral 
domain;  however,  the  behavior  may  not  be  immediately  known,  or  it  may  be  difficult 
to  capture  the  exact  behavior  in  a  formal  description  language  such  as  VHDL.  For 
these  reasons,  re-synthesis  may  involve  a  degree  of  risk  that  designers  are  unwilling  to 
incur. 

The  second  option,  which  is  the  only  alternative  when  re-synthesis  of  a  design  can  not 
be  done,  is  to  try  to  enhance  the  testability  of  the  existing  design.  This  second  option  is 


122 


123 


the  one  explored  in  this  chapter.  The  goal  of  this  work  is  to  present  a  systematic 
approach  for  testability  enhancement  of  existing  logic  level  designs  utilizing  pseudo¬ 
random  BIST.  The  approach  is  to  remove  the  resistance  of  the  circuit  to  random  pat¬ 
terns  in  test  mode  while  changing  as  little  of  the  design  as  possible.  A  testability 
metric,  entropy,  is  used  to  pinpoint  problem  areas  of  the  design;  entropy  is  calculated 
using  probabilistic  analysis  based  on  the  Markov  model  presented  in  Chapter  Five. 
Signals  with  low  entropy  can  make  testing  difficult  by  affecting  both  controllability 
and  observability  of  internal  parts  of  the  circuit. 

This  chapter  next  describes  a  methodology  for  built-in  self-test  (BIST)  insertion  in 
logic  level  circuits.  The  methodology  is  demonstrated  on  a  submodule  of  an  industrial 
logic  level  design.  Fault  simulation  is  used  to  show  that  the  resulting  modified  circuit 
designs  are  indeed  more  easily  tested  than  the  original. 


7.1  Testability  Enhancement  Technique 

This  section  describes  our  testability  enhancement  procedure.  We  assume  the  test  par¬ 
adigm  of  Figure  7-1,  i.e.,  we  assume  diat  in  test  mode  the  overall  circuit  is  configured 


Figure  7-1.  The  basic  BIST  scheme  used. 
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SO  that  the  inputs  of  a  sequential  module  are  driven  by  a  test  pattern  generation  register 
(TPGR),  and  the  outputs  of  the  module  drive  a  multiple  input  shift  register  (MISR).  In 
this  way,  we  can  focus  on  the  sequential  circuit  under  test  (CUT),  assuming  that  is 
driven  by  random  patterns  of  known  distribution.  Essentially,  we  will  be  looking 
within  a  circuit  for  lines  with  1 -probabilities  that  are  very  close  to  either  0  or  1.  Alter¬ 
natively,  we  may  look  for  lines  of  low  entropy,  as  defined  in  Equation  4- 1  on  page  47. 
The  1 -probabilities  and  entropies  for  various  points  in  the  circuit  are  calculated  using 
the  Markov  model  described  in  Chapter  Five. 

We  will  take  two  different  approaches  in  dealing  with  problem  areas  in  a  design.  The 
first  approach  involves  the  use  of  the  circular  BIST  methodology  within  the  sequential 
CUT.  For  this  approach,  the  overall  BIST  scheme  of  Figure  7- 1  is  preserved,  but  some 
of  the  flip-flops  internal  to  the  CUT  are  used  to  form  a  circular  BIST  chain.  The  sec¬ 
ond  approach  involves  the  use  of  multiplexers  to  insert  test  points  that  alter  the  func¬ 
tioning  of  the  circuit  during  test;  a  multiplexer,  controlled  by  a  signal  that  indicates 
whether  the  circuit  is  in  normal  or  test  mode,  may  serve,  for  example,  to  deliver  high 
quality  test  patterns  to  a  portion  of  the  circuit  that  is  not  ordinarily  highly  controllable. 

Our  testability  enhancement  methodology,  shown  in  Rgure  7-2,  is  a  two  step  tech¬ 
nique.  The  first  step  involves  the  testability  enhancement  for  flip-flops  that  have  low 
entropy.  Once  the  flip-flop  entropies  have  been  calculated,  testability  problems  may 
become  apparent;  since  the  flip-flops  drive  the  combinational  logic,  low  entropy  of  the 
flip-flops  translates  directly  into  the  application  of  poor  quality  tests  to  the  gates  of  the 
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Algorithm  Testability  Enhancement  \ 

\ 

Input:  A  sequential  circuit  design.  | 

Output:  The  circuit  design,  modified  to  enhance  testability.  j 

I 

Step  1.  Use  the  Markov  model  to  calculate  flip-flop  entropies.  | 

For  those  flip-flops  with  low  entropy,  do  testability  insertion  to 
enhance  the  entropy. 

If  fault  coverage  of  resulting  design  is  satisfactory,  stop  here. 

Step  2.  Within  the  combinational  logic  segments,  use  the  Parker-McCluskey 
algorithm  to  find  entropies  of  internal  nodes. 

Select  the  line  nearest  the  primary  inputs  that  has  low  entropy. 
Enhance  the  entropy  of  that  line  by  doing  testability  insertion. 
Recompute  the  internal  node  entropies. 

If  some  internal  lines  still  have  low  entropy,  repeat  the  selection  and 
enhancement  part  of  this  step. 

i 

i _ I 

Figure  7-2.  Overview  of  testability  enhancement  technique. 


circuit,  and  thus  into  low  fault  coverage  for  the  combinational  logic.  Therefore,  if  any 
flip-flop  in  the  CUT  has  low  entropy,  the  situation  should  be  remedied.  For  circuits 
using  circular  BIST,  this  may  be  accomplished  by  replacing  the  low  entropy  flip-flop 
with  a  test  flip-flop  that  in  test  mode  is  part  of  the  circular  BIST  chain.  For  circuits 
using  multiplexer-based  test  point  insertion,  flip-flop  entropy  can  often  be  raised  by 
breaking  loops.  When  Step  1  is  complete,  all  the  combinational  logic  of  the  sequential 
CUT  is  driven  with  high  quality  test  patterns  when  in  test  mode. 


If  test  quality  of  the  circuit  is  still  poor,  it  is  necessary  to  take  a  closer  look  at  the  com¬ 
binational  logic  segments,  and  to  alter  the  structure  of  the  combinational  logic  to 
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remove  the  random  pattern  resistance  [BaMS87]  that  can  make  it  difficult  to  test  the 
combinational  logic  using  random  patterns.  This  is  done  in  the  second  step  of  the  test¬ 
ability  enhancement  methodology,  which  involves  looking  at  the  entropies  of  nodes 
internal  to  the  combinational  logic  segments.  We  apply  the  Parker-McCluskey  algo¬ 
rithm  [PaMc75]  to  the  combinational  logic  segments  to  find  signal  lines  of  low 
entropy.  Again,  there  are  a  number  of  ways  to  improve  the  entropy  of  individual  signal 
lines;  we  may  choose  to  insert  an  additional  circular  BIST  flip-flop  that  is  transparent 
in  normal  mode,  or  to  use  a  multiplexer  to  insert  a  test  point. 


7.2  Example  Application 

This  section  describes  the  circuit  that  will  be  used  as  an  example  to  provide  the  details 
of  the  testability  enhancement  technique.  The  circuit,  shown  in  Figure  7-3,  is  a  sub- 
module  of  an  existing  design  provided  by  Rockwell  International.  The  overall  chip 
was  designed  by  Rockwell  engineers  as  a  large  number  of  submodules,  each  of  which 
is  a  sequential  circuit.  The  submodules  are  quite  small,  with  typically  about  ten  inter¬ 
nal  flip-flops.  The  submodules  are  interconnected  via  intervening  registers. 

As  a  first  step,  conventional  BIST  was  added  to  the  overall  chip  in  the  following  sim¬ 
ple  way;  the  registers  of  the  chip  were  replaced  with  special  test  registers  that  can  act 
as  test  pattern  generators  (TPGRs)  and  multiple  input  shift  registers  (MISRs).  Thus, 
we  test  each  submodule  separately,  using  the  paradigm  of  Figure  7-1.  Our  goal  is  to 
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Start-5 

Figure  7-3.  A  submodule  of  an  industrial  design. 


modify  the  submodules,  if  necessary,  to  make  them  random  pattern  testable.  We  dem¬ 
onstrate  this  by  applying  the  testability  enhancement  technique  of  Figure  7-2  to  the 
example  submodule. 

We  consider  two  distinct  sets  of  experiments.  The  first  uses  circular  BIST  within  the 
sequential  circuit,  and  the  second  employs  multiplexer-based  test  point  insertion.  For 
both  sets  of  experiments,  we  compare  fault  coverage,  area,  and  critical  delay  for  the 
original  and  enhanced  variations  of  the  circuit.  Fault  coverage  data  are  obtained  using 
AT&T’s  GENTEST  fault  simulator  [Davi94];  area  and  performance  figures  come 
from  the  COMPASS  Design  Automation  suite  of  tools  [CODA92].  Areas  are  found  by 
performing  layout  synthesis  using  standard  cells.  Performance  is  measured  in  terms  of 
the  critical  delay  for  the  circuit,  and  comes  from  a  timing  verifier  in  the  COMPASS 
suite,  using  circuit  models  that  include  interconnect  delay.  Area  and  critical  delay  fig- 
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ures  are  for  the  sequential  CUT  only,  and  do  not  include  the  TPGR  and  MISR.  Over¬ 
head  for  a  circuit  is  calculated  with  the  following  formula: 

,  ,  ^  ~  ^baseline 

overhead  =  - , 

^baseline 

where  v  is  the  value  for  die  circuit  in  question,  and  vtaseHne  corresponding  value 
for  a  baseline  circuit. 

Regardless  of  which  methodology  is  used  for  testability  enhancement,  the  first  step  of 
the  testability  enhancement  methodology  requires  calculation  of  all  flip-flop  1-proba¬ 
bilities  and  entropies.  Before  this  can  be  done,  the  combinational  logic  of  the  circuit 
must  be  divided  into  segments  to  create  an  “RTL”  view  of  the  circuit.  The  need  for  this 
step  was  described  in  Section  5.4  of  Chapter  Five,  which  describes  the  1-probability 
computation.  The  next  subsection  describes  the  segmentation  of  the  combinational 
logic  for  this  example.  A  second  subsection  presents  flip-flop  entropies  for  the  original 
circuit,  and  describes  why  the  low  entropies  adversely  affect  testability. 

7.2.1  Segmentation  of  combinational  logic 

In  order  to  create  an  “RTL”  view  of  the  circuit,  we  group  the  gates  of  the  submodule 
into  combinational  logic  segments  driving  each  flip-flop  and  primary  output.  This  seg¬ 
mentation  of  the  circuit  is  shown  in  Figure  7-4,  in  which  the  rectangular  boxes  are 
multiplexers  and  the  thick  lines  are  multiple  signal  lines.  There  are  two  important 
aspects  to  note  about  the  segmentation: 
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(b)  Combinational  logic  driving  JKPTO  (signal  YSCSTR). 

an 


Sc2t3] 1 
old_yacatr  < 


-Start 


'_stlock 


A 


_yscstr 


B 


CL  Segment  C 

(c)  Combinational  logic  driving  primary  output  START. 

Figure  7-4.  Combinational  logic  of  circuit,  grouped  into  segments. 


130 


•  Since  our  Markov  Chain  model  handles  only  simple  D  flip-flops,  we  must 
model  any  JK  flip-flops  in  terms  of  D  flip-flops,  combinational  logic,  and 
feedback,  as  shown  in  Figure  5-16  on  page  93.  Note  that  this  change  is  for 
the  sake  of  analysis  only;  it  is  not  necessary  to  replace  the  JK  flip-flop  in 
the  physical  circuit. 

•  Since  a  gate  may  drive  more  than  one  flip-flop  or  primary  output  (by  fan¬ 
ning  out),  the  combinational  logic  segments  may  overlap.  For  example,  OR 
gate  G8  is  in  both  CL  segment  A  and  B. 

The  result  of  segmentation  is  a  register  transfer  level  view  of  the  sequential  CUT;  this 
view  is  shown  in  Figure  7-5. 


primary  inputs  (from  TPGR) 


primary  output  START 
(to  MISR) 


Figure  7-5.  RTL  view  of  the  example  circuit. 
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7.2.2  Analysis  of  flip-flop  entropies 

It  is  to  the  “RTL”  view  that  we  apply  the  Markov  model  to  compute  the  flip-flop  1- 
probabilities  and  entropies.  Results  of  the  probabilistic  analysis  (which  required  two 
iterations)  are  shown  in  Table  7-1.  These  results  bring  two  serious  testability  problems 


signal  name 

1-prob. 

entropy 

SR  [7]  to  SR  [0] 

0.5 

1 

STLOCK 

1 

0 

YSCSTR 

0 

0 

START 

0.5 

1 

Table  7-1.  Rip-flop  and  primary  output  1-probabilities  and  entropies  for  the 

original  circuit. 

to  light:  both  STLOCK  and  YSCSTR  have  entropies  of  zero  in  the  steady-state.  From 
Figure  7-6,  which  is  a  fragment  of  the  original  circuit  affected  by  the  STLOCK  signal, 
it  is  easy  to  see  that  once  the  STLOCK  signal  goes  high,  feedback  ensures  that  it  stays 
high.  Therefore,  in  the  steady  state  STLCXZK  has  a  1-probability  of  one,  and  an 
entropy  (using  Equation  4- 1  on  page  47)  of  zero.  A  similar  problem  ensuring  that  the 
YSCSTR  signal  stays  low  is  less  immediately  apparent  from  the  circuit  structure. 
These  low  entropies  cause  tremendous  testability  difficulties;  when  the  STLOCK  sig¬ 
nal  is  high,  it  serves  to  block  passage  of  values  through  NOR  gate  G4,  and  in  turn 
through  AND  gates  G5,  G6,  and  G7  (see  Figure  7-6).  Thus,  any  fault  effects  observ¬ 
able  at  the  inputs  of  these  gates  will  not  be  observable  at  the  primary  output  of  the  cir¬ 
cuit.  Thus,  the  STLOCK  signal,  when  high,  effectively  cuts  off  observability  of  a  large 
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Figure  7-6.  Feedback  of  the  STLOCK  signal  hampers  testability. 

section  of  the  circuit.  The  YSCSTR  signal  causes  similar  problems  with  AND  gates 
G6,  G7,  and  GW. 

A  first  testability  enhancement  to  the  circuit  should  ensure  that  signals  STLOCK  and 
YSCSTR  do  not  remain  constant.  The  entropy  of  these  signals  can  be  increased  in  a 
number  of  ways.  We  will  explore  two  ways  in  this  chapter.  Using  the  circular  BIST 
methodology,  we  will  replace  the  low  entropy  flip-flops  with  test  flip-flops  that  are 
included  in  the  circular  BIST  chain.  In  another  experiment,  we  will  use  multiplexers  to 
break  feedback  loops  when  the  circuit  is  in  test  mode;  feedback  is  a  major  cause  of  low 
entropy.  A  third  method  for  enhancing  flip-flop  entropy  is  to  use  a  partial  scan  chain 
consisting  of  the  flip-flops  with  low  entropy  so  that  the  states  of  these  flip-flops  can  be 
controlled  directly  during  the  test  session  [LiZB93]. 


133 


7.3  Circular  BIST-Based  Test  Point  Insertion 

In  this  subsection,  experiments  that  use  the  circular  BIST  methodology  to  enhance 
testability  are  described.  The  basic  scheme  used  is  illustrated  in  Figure  7-7.  Selected 
flip-flops  of  the  sequential  CUT  are  replaced  with  circular  BIST  test  flip-flops.  During 
normal  mode  operation,  these  test  flip-flops  act  like  regular  flip-flops,  but  in  test  mode 
they  form  a  circular  BIST  chain.  The  first  flip-flop  in  the  chain  is  driven  with  random 
patterns  from  one  output  of  the  TPGR,  and  the  last  flip-flop  in  the  chain  drives  one 
input  of  the  MISR. 


Figure  7-7.  The  circular  BIST  scheme. 


We  use  the  original  circuit  as  a  baseline  circuit  for  the  experiments  described  in  this 
section.  As  a  first  enhancement  (following  Step  1  of  Rgure  7-2),  flip-flop  entropy  is 
improved  by  including  flip-flops  with  low  entropy  in  the  circular  BIST  chain.  Thus, 
flip-flops  DFF8  and  JKFFO  are  replaced  with  test  flip-flops  and  configured  into  a 
chain.  A  dummy  flip-flop  was  required  between  DFF8  and  JKFFO  to  prevent  register 
adjacency,  as  described  in  Chapter  Six.  Table  7-2  compares  flip-flop  entropy  for  the 
original  circuit,  which  we  denote  Version  /,  and  the  circuit  variation  with  increased 
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signal  name 

Version  I 

Version  II 

1-prob. 

entropy 

1-prob.  1  entropy 

SR  [7]  to  SR  [0] 

0.5 

1 

0.5 

1 

STLOCK 

1 

0 

0.5 

1 

YSCSTR 

0 

0 

0.5 

1 

START 

0.5 

1 

i _ 

0.5 

1 

Table  7-2.  Flip-flop  and  primary  output  1-probabilities  and  entropies  for  the 
original  circuit  and  the  first  circular  BIST-based  enhancement. 


flip-flop  entropy,  which  we  denote  Version  II.  From  the  table,  we  see  that  in  Version  II 
the  flip-flops  have  very  high  entropy.  High  entropies  ensure  that  good  quality  test  pat¬ 
terns  are  delivered  to  each  combinational  logic  segment. 


The  lower  two  curves  of  Figure  7-8  compare  fault  coverage  for  the  original  circuit  and 
the  first  circular  BIST  variation.  Version  II.  From  Figure  7-8,  we  see  that  while 
enhancement  of  flip-flop  entropy  has  improved  the  fault  coverage  for  the  example  cir¬ 
cuit,  fault  coverage  is  still  not  high.  This  is  the  case  when  the  combinational  logic  seg- 
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Figure  7-8.  Fault  coverage  curves  for  the  circular  BIST-based  circuits. 
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merits  are  random  pattern  resistant.  One  good  indication  of  random  pattern  resistance 
is  low  entropy  on  one  or  more  internal  lines  of  the  circuit;  if  a  line  has  low  entropy,  one 
of  the  logic  values  is  difficult  to  generate  on  that  line.  As  a  result,  any  gates  that  are 
driven  by  that  line  may  be  difficult  to  test,  especially  if  detecting  some  fault  near  that 
gate  involves  controlling  the  line  to  the  difficult- to-obtain  value. 

The  second  step  of  the  testability  enhancement  technique  of  Rgure  7-2  obtains  the  1- 
probabilities  and  entropies  of  internal  lines  of  the  circuit  by  applying  the  Parker- 
McCluskey  algorithm  to  the  combinational  logic  segments  of  Figure  7-4.  Results  for 
Version  //are  shown  in  the  first  two  columns  of  Table  7-3.  These  results  show  that 
even  though  we  are  supplying  each  combinational  logic  segment  with  uniformly  dis¬ 
tributed  pseudorandom  patterns,  the  entropy  of  internal  lines  is  low.  As  shown  in  the 
table,  the  output  of  G4  has  an  entropy  of  only  0. 1 1615;  this  also  causes  low  entropy  in 
parts  of  the  circuit  driven  by  G4  (G5,  G6,  G7,  etc.).  We  can  obtain  a  value  of  1  at  the 
output  of  G4  only  approximately  l/64th  of  the  time.  Since  controlling  this  node  to 
value  1  is  necessary  for  fully  testing  gates  G5,  G6  and  G7,  fully  testing  the  circuit  may 
require  using  an  unacceptably  long  test  session.  This  is  reflected  in  the  fault  coverage 
curve  for  Version  11;  the  curve  rises  slowly  instead  of  reaching  a  knee  and  leveling  off 
suddenly. 

For  our  circular  BIST  experiment,  we  will  improve  the  entropy  at  the  output  of  G4  by 
inserting  a  test  flip-flop  at  this  node,  so  that  during  test,  gates  G5,  G6,  and  G7  are  not 
driven  by  G4,  but  are  instead  driven  directly  by  the  circular  BIST  chain.  The  flip-flop. 
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Tab 


gate  or  node 

Version  II 

Version  III 

1-prob. 

entropy 

1-prob. 

entropy 

MUXO  to 
MUX3 

0.5 

1 

0.5 

1 

GO  to  Gi 

0.5000 

1 

0.5000 

1 

G4  (observed) 

0.1563 

0.1162 

0.1563 

0.1162 

G4  (fanouts) 

— 

— 

0.5000 

1 

G5 

0.003906 

0.03688 

0.1250 

0.5436 

G6 

0.001953 

0.02039 

0.0625 

0.3373 

G7 

0.001953 

0.02039 

0.0625 

0.3373 

G8 

0.2529 

0.8159 

0.3125 

0.8960 

G9 

0.2500 

0.8113 

0.25 

0.8113 

GIO 

0.1279 

0.5517 

0.1563 

0.6253 

Gll 

0.5640 

0.9882 

0.5781 

0.9823 

STLOCK 

0.5000 

1 

0.5000 

1 

GI2  (START) 

0.5000 

1 

0.5000 

1 

JK  function 

0.3760 

0.9552 

0.3906 

0.9652 

YSCSTR 

0.5000 

1 

0.5000 

1 

e  7-3.  Internal  node  1-probabilities  and  entropies  for  the  circular  BIS’ 


-based 


circuit  variations. 


which  is  transparent  in  normal  mode,  ensures  randomness  at  the  trouble  node  during 


test.  This  variation  of  the  circuit  is  denoted  Version  III.  Table  7-3  shows  that  internal 


line  entropies  for  Version  HI  are  substantially  higher  than  for  Version  II.  Figure  7-8 
shows  that  Version  III  has  significantly  higher  fault  coverage  than  Version  I  and  Ver¬ 
sion  ir,  note  that  Version  ///also  reaches  high  fault  coverage  much  more  quickly  than 


Version  II. 


The  majority  of  the  faults  not  detected  in  Version  III  of  the  circuit  are  on  the  asynchro¬ 
nous  reset  lines  of  the  flip-flops;  since  the  reset  lines  are  exercised  only  once  at  the 


137 


beginning  of  the  test,  it  is  difficult  to  tell  whether  the  resets  are  working  correctly.  In 
particular,  we  may  not  notice  if  a  reset  is  stuck  in  its  inactive  state.  In  the  next  set  of 
experiments,  which  are  run  on  exactly  the  same  three  circuits  as  the  previous  set  of 
experiments,  we  remedy  the  situation  by  exercising  the  reset  lines  during  the  test.  We 
do  so  in  a  pseudorandom  fashion;  however,  we  weight  the  pseudorandom  pattern  on 
the  reset  line  so  that  resetting  is  done  infrequently.  By  resetting  infrequently,  we  can 
catch  faults  on  the  reset  lines  without  disrupting  the  rest  of  the  test  too  severely. 

Figure  7-9  shows  fault  coverage  curves  for  the  original  circuit  and  the  two  circular 
BIST  variations,  this  time  using  pseudorandom  asynchronous  resets,  weighted  so  that 
the  probability  of  doing  a  reset  at  any  given  clock  is  one-sixteenth.  It  is  important  to 
note  that  this  use  of  the  asynchronous  reset  invalidates  our  entropy  analysis;  the  flip- 
flop  is  artificially  brought  to  the  zero  state,  and  so  the  probability  that  the  flip-flop  is  a 
zero  is  higher.  In  addition,  the  Markov  analysis  works  with  probability  distributions  in 
the  steady-state;  and  therefore  does  not  apply  directly  after  a  reset. 

In  the  case  of  the  original  circuit,  the  pseudorandom  reset  does  more  than  allow  us  to 
catch  faults  on  the  reset  lines;  it  also  improves  the  testability  of  the  rest  of  the  circuit.  It 
does  this  by  kicking  the  circuit  out  of  the  “locked-up”  state  that  caused  the  entropies  to 
be  so  low  (recall  Figure  7-6).  This  is  enough  to  bring  the  achievable  fault  coverage  for 
the  original  circuit  up  from  41%  to  72%;  actually,  using  a  pseudorandom  reset  is 
almost  as  good  as  replacing  JKFFO  and  DFF8  with  circular  BIST  flip-flops.  If  a  circuit 
already  uses  flip-flops  with  reset  capability,  this  can  be  an  area-effective  method  for 
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Figure  7-9.  Fault  coverage  curves  for  the  circular  BIST-based  circuits,  using 
pseudorandom  asynchronous  resets. 


circuit 

Area  in 

Transistor 

count 

Critical  delay 
in  ns 

Version  / 

730  X  576.5 

530 

10.80 

Version  11 

874  X  568.5 

635 

13.44 

Version  III 

954  X  568.5 

723 

14.84 

overhead  for  Version  II 

18.1% 

19.8% 

24.4% 

overhead  for  Version  III 

28.9% 

36.4% 

37.4% 

Table  7-4.  Area  and  performance  figures  for  the  circular  BIST-based  circuit 

variations. 


raising  flip-flop  entropy.  However,  using  a  pseudorandom  reset  is  not  enough  by  itself 
to  attain  high  fault  coverage  because  of  the  low  entropy  of  internal  lines. 


Table  7-4  shows  area  and  performance  characteristics  for  the  three  circular  BIST- 
based  circuit  variations.  Area  overhead  is  high,  especially  for  Version  III.  A  substantial 
part  of  the  problem  here  is  that  we  must  insert  dummy  flip-flops  to  break  up  register 
adjacency  in  the  circular  BIST  chain;  this  would  not  have  been  necessary  if  the  flip- 
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flops  had  not  been  so  interdependent  on  one  another.  The  inserted  test  circuitry  also 
has  a  detrimental  effect  on  the  critical  delay  of  the  circuit. 


7.4  Multiplexer-Based  Test  Point  Insertion 

Instead  of  using  circular  BIST,  it  is  possible  to  use  multiplexer-based  test  point  inser¬ 
tion  for  testability  enhancement.  Here,  multiplexers  are  used  to  deliver  high  quality 
test  patterns  to  internal  parts  of  the  circuit  that  would  not  ordinarily  be  controllable; 
the  multiplexers  are  controlled  by  a  signal  that  indicates  whether  the  circuit  is  in  nor¬ 
mal  or  test  mode. 

In  this  section,  experiments  that  use  the  multiplexer-based  test  point  insertion  to 
enhance  testability  are  described.  Since  the  design  engineers  suspected  that  observ¬ 
ability  is  a  major  problem  with  the  submodule  of  Figure  7-3,  we  start  by  taking  several 
extra  observation  points  out  from  internal  parts  of  the  circuit  to  the  MISR.  We  use  this 
submodule  with  added  observability  as  the  baseline  circuit  for  the  experiments  in  this 
section,  and  refer  to  it  as  Version  I.  For  the  circular  BIST-based  experiments  described 
in  the  previous  subsection,  there  was  no  need  to  explicitly  add  extra  observability 
points  to  the  circuit,  since  the  flip-flops  in  the  circular  BIST  chain  provided  observable 
points  internal  to  the  circuit. 

As  a  first  enhancement,  flip-flop  entropy  is  improved  by  breaking  feedback  loops  dur¬ 
ing  test  mode.  Table  7-2  compares  flip-flop  entropy  for  the  baseline  circuit.  Version  I, 
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and  the  circuit  variation  with  increased  flip-flop  entropy.  The  variation,  which  is 
denoted  Version  II,  uses  multiplexers  to  break  all  of  the  feedback  loops  in  the  circuit; 
note  that  since  some  of  the  feedback  was  inherent  in  the  JK  flip-flop,  this  required 
replacing  the  JK  flip-flop  with  a  D  flip-flop  and  appropriate  combinational  logic.  Dur¬ 
ing  test,  lines  that  are  ordinarily  driven  by  feedback  are  driven  directly  by  the  TPGR 
instead.  From  Table  7-2,  we  see  that  the  variation  has  improved  entropy.  The  higher 
the  entropy  of  the  inputs  to  a  combinational  logic  segment,  the  better  the  quality  of  test 
patterns  applied  to  that  segment. 


signal  name 

Version  I 

Version  II 

1-prob. 

entropy 

1-prob. 

entropy 

SR  [7]  to  SR  [0] 

0.5 

1 

0.5 

1 

STLOCK 

1 

0 

0.5640 

0.9882 

YSCSTR 

0 

0 

0.3760 

0.9552 

START 

0.5 

1 

0.5 

1 

Table  7-5.  Flip-flop  and  primary  output  1-probabilities  and  entropies  for  the 

multiplexer-based  circuits. 


Figure  7-10  shows  the  fault  coverage  curves  for  the  baseline  circuit.  Version  I,  and  the 
multiplexer-based  variation.  Version  II.  As  was  the  case  with  the  circular  BIST  experi¬ 
ments,  we  see  that  enhancement  of  flip-flop  entropy  improves  fault  coverage,  but  not  a 
great  deal.  Again,  we  move  to  the  second  step  of  the  testability  enhancement  technique 
of  Figure  7-2,  to  take  a  closer  look  at  what  is  happening  within  the  combinational 
logic.  1-probabilities  and  entropies  for  Version  II  of  the  circuit  are  shown  in  the  first 
two  columns  of  Table  7-3.  The  values  show  that  the  trouble  is  at  the  same  node  as 
before;  the  output  of  G4  has  low  entropy. 
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Figure  7-10.  Fault  coverage  curves  for  the  multiplexer-based  circuits. 


gate  or  node 

Version  II 

Version  III 

1-prob. 

entropy 

1-prob. 

entropy 

MUXO  to  MUX3 

0.5000 

1 

0.5000 

1 

GO  to  G3 

0.5000 

1 

0.5000 

1 

G4  (observed) 

0.1563 

0.1162 

0.1563 

0.1162 

G4  (fanouts) 

— 

— 

0.5000 

1 

G5 

0.003906 

0.03688 

0.1250 

0.5436 

G6 

0.001953 

0.02039 

0.0625 

0.3373 

G7 

0.001953 

0.02039 

0.0625 

0.3373 

G8 

0.2529 

0.8159 

0.3438 

0.9284 

G9 

0.25 

0.8113 

0.2500 

0.8113 

GIO 

0.1279 

0.5517 

0.2188 

0.7579 

g//(stlcx:k) 

0.5640 

0.9882 

0.6094 

0.9652 

GI2  (START) 

0.5000 

1 

0.5000 

1 

YSCSTR 

0.3760 

0.9552 

0.4063 

0.9745 

Table  7-6.  Internal  node  1 -probabilities  and  entropies  for  the  multiplexer-based 

circuits. 
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For  the  multiplexer-based  methodology,  we  will  improve  the  entropy  at  the  output  of 
G4  by  slicing  the  circuit  at  this  node;  a  multiplexer  is  inserted  so  that  during  test,  gates 
G5,  G6,  and  G7  are  not  driven  by  G4,  but  are  instead  driven  directly  by  the  TPGR.  For 
this  approach,  the  output  of  G4  must  be  connected  directly  to  the  MISR  to  make  fault 
effect  observation  possible.  This  sliced  variation  of  the  circuit  is  designated  Version 
III.  Table  7-3  shows  that  internal  line  entropies  for  the  sliced  circuit  are  substantially 
higher  than  for  the  original.  Figure  7-10  shows  that  Version  ///achieves  higher  fault 
coverage  in  significantly  less  time  than  Version  II. 

As  was  the  case  with  the  circular  BIST-based  variations,  most  of  the  faults  remaning 
undetected  are  on  the  asynchronous  reset  lines.  Fault  coverage  curves  for  the  same  cir¬ 
cuits,  this  time  using  a  weighted  pseudorandom  reset,  are  shown  in  Figure  7- 10. 
Again,  we  see  that  if  the  flip-flops  of  a  circuit  already  have  reset  capabilities,  using  a 
pseudorandom  weighted  reset  can  be  an  effective  way  to  improve  flip-flop  entropy  ;  in 
this  case,  the  use  of  a  pseudorandom  weighted  reset  raised  the  achievable  fault  cover¬ 
age  of  Version  I  up  from  87%  to  almost  95%.  As  was  the  case  with  the  circular-BIST 
based  circuit  variations,  however,  a  pseudorandom  weighted  reset  alone  is  not  enough 
to  achieve  high  fault  coverage,  because  it  can  not  remove  the  random  pattern  resis¬ 
tance  that  causes  lines  internal  to  the  combinational  logic  to  have  low  entropy. 

Comparisons  of  area  and  performance  for  the  three  multiplexer-based  circuit  varia¬ 
tions  are  given  in  Table  7-7.  By  comparing  these  numbers  to  the  results  for  the  circular 
BIST  experiment,  we  see  that  the  multiplexer-based  test  point  insertion  requires  much 
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less  area  overhead  to  do  the  job.  However,  the  critical  delay  overhead  for  Version  III  is 
a  quite  substantial  22.8%.  The  reason  for  this  is  that  the  inserted  multiplexers  lie  on  the 
critical  path. 
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Figure  7-11.  Fault  coverage  for  the  multiplexer-based  circuits,  using  a 
pseudorandom  asynchronous  reset. 


circuit 

Area  in 

Transistor 

count 

Critical  delay 
in  ns 

_ 1 

Version  I 

530 

Version  II 

770  X  584.5 

575 

Version  III 

810x584.5 

589 

13.28 

overhead  for  Version  II 

7.6% 

8.5% 

12.9%  ! 

overhead  for  Version  III 

13.2% 

11.1% 

22.8% 

Table  7-7.  Area  and  performance  figures  for  the  multiplexer-based  variations. 


Work  by  [LiZB93]  points  out  that  it  is  possible  to  use  gates  rather  than  full  multiplex¬ 
ers  to  do  test  point  insertion.  For  our  example,  we  could  use  an  OR  gate  to  slice  the  cir¬ 
cuit  at  the  output  of  gate  G4.  One  input  of  the  OR  gate  is  driven  by  the  output  of  G4, 
and  the  other  is  driven  by  the  TPGR.  During  test  mode,  the  output  of  G4  almost 
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always  has  value  zero,  so  the  output  of  the  inserted  OR  gate  has  entropy  nearly  equal 
to  one.  During  normal  mode,  we  must  be  careful  that  the  TPGR  generates  a  zero,  so 
that  the  inserted  OR  gate  does  not  affect  the  functioning  of  the  circuit.  Lin’s  test  point 
insertion  will  have  substantially  less  impact  on  area  and  performance,  since  an  OR 
gate  is  both  smaller  and  faster  than  a  full  multiplexer. 


7.5  Summary 

Many  existing  circuit  designs  are  not  sufficiently  testable;  they  require  testability 
enhancement,  either  through  complete  re-synthesis  or  small,  local  design  modifica¬ 
tions.  When  re-synthesis  of  a  design  can  not  be  done,  the  only  alternative  is  to  try  to 
enhance  the  testability  of  the  existing  design.  The  proposed  technique  for  probabilistic 
analysis  of  sequential  gate-level  circuits  provides  a  systematic  approach  for  testability 
enhancement  of  sequential  circuits.  It  has  been  used  to  enhance  the  testability  of  a 
sample  submodule  of  an  industrial  design.  For  the  style  of  design  used,  i.e.,  for  rela¬ 
tively  small  sequential  circuits  interconnected  via  system  registers,  the  technique  is 
computationally  inexpensive,  and  helps  to  quickly  find  areas  of  random  pattern  resis¬ 
tance  in  the  design.  These  problem  areas  can  then  be  eliminated  with  simple  design 
modifications.  Resulting  modified  designs  enjoy  significantly  higher  fault  coverage 
than  the  original. 


Chapter  8 


BIST  Insertion  at  the  Register 
Transfer  Level 


Traditionally,  BIST  insertion  at  the  register  transfer  level  is  done  by  making  each  arith¬ 
metic  logic  unit  (ALU)  in  a  datapath  directly  testable;  controllable  registers  (TPGRs) 
are  placed  at  each  ALU’s  inputs,  and  an  observable  register  (MISR)  is  placed  at  each 
ALU’s  output.  However,  such  an  invasive  addition  of  test  registers  may  not  be  neces¬ 
sary.  For  example,  suppose  that  the  input  registers  to  an  ALU  are  not  directly  control¬ 
lable,  but  that  they  still  generate  patterns  that  are  random  enough  to  effectively  test  the 
ALU.  In  this  case,  there  is  no  need  to  replace  the  normal  system  registers  with  test  pat¬ 
tern  generation  registers;  actually,  to  do  so  is  often  undesirable,  since  TPGRs  are  larger 
and  slower  than  ordinary  registers.  Thus,  in  selecting  registers  for  BIST  insertion,  a 
trade-off  can  be  made,  with  test  quality  on  one  side  and  area  and  performance  on  the 
other.  This  chapter  explores  this  trade-off  for  register  transfer  level  datapaths. 
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8.1  Experimental  Set-Up 

In  this  chapter,  experiments  are  done  with  several  register  transfer  level  circuits.  Three 
testability  metrics  are  used:  randomness,  expected  state  coverage,  and  transparency. 
The  testability  metrics  are  computed  for  the  registers  of  the  circuits  using  the  Markov 
model  of  Chapter  Five,  and  decisions  about  where  to  place  test  registers  within  the  cir¬ 
cuits  are  based  on  the  metrics.  Fault  coverage  curves  show  the  effects  of  the  BIST 
insertion  decisions  on  the  testability  of  the  resulting  circuits.  These  experiments  serve 
two  purposes:  to  validate  the  Markov  model  by  showing  that  the  analytical  predictions 
are  close  to  actual  values  obtained  by  simulation,  and  to  show  how  the  metrics  can  be 
used  to  compare  different  BIST  configurations  in  terms  of  test  quality,  and  therefore  to 
guide  BIST  insertion. 

Analytical  predictions  for  the  testability  metrics  are  from  the  Markov  model;  computa¬ 
tion  times  for  the  Markov  model  analysis  are  for  a  SPARCstation  IPC  with  36  MB  of 
memory.  Actual  values  for  the  testability  metrics  are  obtained  from  a  computer  pro¬ 
gram  that  simulates  the  functioning  of  an  RTL  circuit  while  in  test  mode,  computing 
the  registers’  states  at  each  time  step.  The  simulation  uses  a  system  state  seed  of  all  ‘  T 
bits.  The  choice  of  system  state  seed  is  arbitrary  ;  any  seed  that  does  not  lead  too 
quickly  to  a  system  state  cycle  will  produce  similar  results. 

Layout  area,  transistor  count,  and  critical  delay  figures  are  given  so  that  the  trade-off 
between  test  quality  and  cost  can  be  better  understood.  All  are  derived  in  the  COM- 


147 


PASS  Design  Automation  suite  of  tools  [CODA92].  Layout  area  is  found  by  placing 
and  routing  the  circuit.  We  include  transistor  count  as  a  supplementary  measure  of 
area,  because  layout  area  depends  heavily  on  the  algorithm  used  for  the  routing  and  the 
manufacturing  process  used.  For  example,  layout  area  results  here  are  for  a  process 
with  two  levels  of  metallization;  if  three  or  four  levels  are  available,  the  layout  areas 
will  be  considerably  different.  The  critical  delay  for  the  circuit  is  the  delay  along  the 
slowest  combinational  path  in  the  circuit,  and  reflects  the  speed  at  which  the  circuit 
can  be  clocked.  It  includes  both  gate  delays  and  interconnect  delays  based  on  the  rout¬ 
ing.  For  all  area  and  performance  figures,  overhead  is  given  in  terms  of  a  version  of 
the  circuit  which  contains  the  minimal  amount  of  hardware  needed  for  the  circular 
self- test  path  technique;  overhead  for  a  circuit  is  calculated  with  the  following  for¬ 
mula: 


overhead  = 


V  -  V  . 
nun 


where  v  is  the  value  for  the  circuit  in  question,  and  Vjjjjn  is  the  corresponding  value  for 
the  “minimal”  circuit. 


Fault  coverage  curves  are  from  GENTEST,  an  AT&T  fault  simulator,  and  include  only 
those  faults  within  the  ALUs.  The  test  session  for  the  circuit  begins  by  configuring  the 
circular  BIST  test  registers  into  a  scan  chain,  and  shifting  in  the  desired  system  state 
seed.  This  initialization  phase  is  not  included  in  the  fault  coverage  curves.  Tradition¬ 
ally,  the  signature  is  obtained  by  observing  one  element  of  the  circular  self-test  path 


for  a  limited  number  of  clocks  at  the  end  of  the  test  session.  However,  we  found  this 


difficult  to  implement  in  GENTEST;  instead,  we  observed  the  element  throughout  the 
entire  test  session.  As  a  result,  our  experiments  neglect  some  of  the  probability  of 
aliasing  during  circuit  response  compaction. 

In  these  examples,  the  values  of  the  testability  metrics  are  used  to  decide  where  to 
insert  test  registers.  Low  randomness  can  have  an  adverse  effect  on  test  quality.  At 
best,  low  randomness  values  require  the  use  of  a  longer  test  session  to  adequately  test 
the  ALUs  in  the  circuit;  in  some  cases,  however,  even  an  increased  test  length  will  not 
help  to  generate  an  adequate  number  of  different  test  patterns.  Expected  state  coverage 
can  help  distinguish  between  these  two  cases.  Low  transparency  can  also  adversely 
affect  test  quality  by  making  it  difficult  to  observe  the  effect  of  faults,  as  discussed  in 
Chapter  Four. 


8.2  Example  One:  A  Cascade 

The  first  example,  shown  in  Figure  8-1,  is  a  four  bit  wide  cascade  of  adders  and  multi¬ 
pliers  that  performs  the  arithmetic  function  F  =  {ab)  {cd)  +  {ef)  {g  •¥  h)  .  Testabil¬ 
ity  metrics,  both  from  analysis  and  simulation,  are  shown  in  Table  8-1  for  two  different 
circular  self-test  path  choices.  Part  (a)  of  the  table  corresponds  to  a  version  of  the  cir¬ 
cuit  with  a  minimal  circular  self-test  path,  that  is,  with  a  path  containing  only  the  pri¬ 
mary  input  and  primary  output  registers;  for  this  circuit,  which  we  refer  to  as  Version  /, 
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the  circular  self-test  path  is  ( <=  REGS,  4=  REG7,  <=  REG6,  ^  REGS,  REG4, 

<=  REGS,  <=  REG2,  <=  REGl,  4=  REGIS),  where  the  order  shown  indicates  the 
order  of  the  registers  in  the  path,  and  <=  denotes  that  the  indicated  register  is  shifted  to 
the  left,  from  least  significant  bit  to  most  significant  bit.^  ^  Randomness  (MR), 
expected  state  coverage  (ESC)  in  96  clocks,  and  transparency  (MT)  of  all  registers  are 
shown. 


Figure  8- 1.  A  cascade  of  adders  and  multipliers. 


For  this  circuit  with  a  minimal  circular  self-test  path,  die  testability  metrics  of  some 
registers  not  in  the  path  is  less  than  ideal.  In  particular,  registers  REG13  and  REG14 
have  low  randomness,  and  registers  REG9,  REGIO,  REGll,  and  REG12  have  low 


1.  We  will  later  use  =»  to  denote  that  a  register  is  shifted  to  the  right. 

2.  For  each  example  in  this  chapter,  the  order  of  the  registers  in  the  circular  self-test 
path,  and  the  orientation  or  shift  direction  for  each  register,  were  chosen  to  avoid  bit 
correlation  problems  as  described  in  Chapter  Six. 
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REG  1-8 

REG9 

REG  10 

REGll 

_ 

REG12 

REG13 

REG  14 

REG15 

MR 

Analy. 

1 

0.9295 

0.9295 

1 

0.6389 

0.7915 

1 

Sim. 

0.9999 

0.9298 

0.9296 

0.9996 

0.6400 

0.7912 

0.9997 

ESC 
in  96 
elks 

Analy. 

0.9981 

0.9756 

0.9756 

0.9756 

0.9981 

0.7525 

0.8875 

0.9981 

Sim. 

1 

! 

0.9375 

0.9375 

1 

1 

0.7500 

1 

1 

MT 

Anafy. 

1 

0.7000 

0.7000 

0.8670 

0.7000 

1 

1 

1 

(a)  Version  /,  with  a  minimal  circular  self-test  path. 


MR 

Analy. 

1 

1 

1 

1 

1 

0.9295 

1 

Sim. 

0.9999 

0.9999 

0.9999 

0.9999 

0.9999 

0.9308 

0.9999 

ESC 

in  96 
elks 

Analy. 

0.9981 

0.9981 

0.9981 

0.9981 

0.9981 

0.9756 

0.9756 

0.9981 

Sim. 

1 

,  1 

1 

1 

1 

1 

1 

MT 

Analy. 

1 

1 

1 

1 

1 

1 

1 

1 

(b)  Version  11,  with  registers  REG9-12  added  to  the  path. 


Table  8-1.  Testability  metrics  for  the  cascade. 


transparency.  In  order  to  boost  these  low  values,  we  choose  to  insert  additional  BIST 
registers  in  place  of  registers  REG9,  REG  10,  REGl  1,  and  RjEG12;  for  Version  II  of  the 
circuit,  the  circular  self-test  path  is  (<=  REGS,  «=REG7,  <=REG6,  ^REG5, 
<=REG4,  <=REG3,  <:=REG2,  <=REG1,  <?=REG15,  <=REG12,  <=REG11, 

<=  REG  10,  <=  REG9).  Testability  metrics  for  this  variation  are  in  part  (b)  of  Table  8-1, 
and  are  all  improved. 


For  the  circular  BIST  methodologies,  the  insertion  of  a  test  register  boosts  both  con¬ 
trollability  and  observability;  that  is  why  our  primary  inputs  have  high  transparency. 
Note  also  that  although  we  replaced  registers  9  and  10  with  test  registers  for  the  pur¬ 
pose  of  improving  transparency,  it  also  brought  the  randomness  of  these  registers  up  to 
1.  This  in  turn  was  sufficient  to  bring  the  randomness  of  register  13  up  from  a  low 
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value  of  0.6389  to  the  more  reasonable  value  of  0.9295,  and  the  randomness  of  register 
14  up  from  0.7915  to  0.9295. 

Table  8-1  shows  a  good  agreement  between  the  analytical  values  predicted  by  our 
Markov  model  and  actual  values  obtained  from  simulation.  In  general,  we  expect  a 
closer  agreement  for  randonmess  than  for  expected  state  coverage,  because  actual  val¬ 
ues  for  expected  state  coverage  depend  heavily  on  the  choice  of  the  system  state  seed 
for  short  test  sessions.  Computation  of  the  testability  metrics  for  both  circuits  using  the 
Markov  model  required  a  total  of  1.03  CPU  seconds. 

Figure  8-2  shows  fault  coverage  curves  for  the  two  versions  of  the  circuit.  For  this 
example,  there  is  not  a  large  difference  in  the  attainable  fault  coverage;  Version  / 
leaves  only  one  fault  undetected,  and  Version  II  leaves  no  faults  undetected.  However, 
the  curves  show  that  the  Version  II  is  significantly  more  testable  than  Version  I  in  the 
sense  that  it  reaches  high  fault  coverage  much  more  quickly.  Version  //reaches  100% 
fault  coverage  in  105  clocks;  in  contrast,  Vfer5ion  /  requires  almost  three  times  as  much 
time,  300  clocks,  to  reach  its  top  fault  coverage  of  99.9%. 

Table  8-2  shows  the  overhead  involved  in  adding  the  new  test  registers  to  the  circuit.  It 
shows  area  and  performance  figures  for  die  two  versions  of  the  cascade.  From  the 
table,  we  see  that  changing  registers  9  through  12  into  circular  BIST  test  registers 
resulted  in  a  circuit  that  was  less  than  9%  bigger  than  the  original.  The  addition  also 


slowed  the  circuit  down  20.8%. 
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Figure  8-2.  Fault  coverage  curves  for  the  cascade. 


Area  in 

Transistor 

Count 

Critical  Delay 
inns 

Version  I 

2242  X  1407 

4140 

10.72 

Version  II 

2346 X 1415 

4492 

12.95 

overhead 

5.2% 

8.5% 

20.8% 

Table  8-2.  Area  and  performance  figures  for  the  cascade. 


In  practice,  a  trade-oflf  can  be  done  between  the  test  quality  and  the  impact  on  area  and 
performance,  hi  this  example,  we  could  have  chosen  to  insert  fewer  test  points;  for 
example,  we  may  have  decided  to  replace  registers  9,  10,  and  12  with  circular  BIST 
registers  to  boost  their  transparencies  of  0.7000  up  to  1,  but  to  leave  register  11,  with  a 
transparency  of  0.8670,  a  normal  register.  This  decision  would  have  had  less  of  an 
impact,  particularly  on  area,  and  would  also  have  resulted  in  a  circuit  somewhere 
between  Version  I  and  Version  II  in  terms  of  test  time  required  to  achieve  high  fault 


coverage. 
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8.3  Example  Two:  An  Arithmetic  Pipeline 

Our  second  example,  shown  in  Figure  8-3,  is  typical  of  an  arithmetic  pipeline,  and  is 

a  (0 

four  bits  wide.  It  computes  the  recurrence  relation  F  ( r  -i-  3)  =  +  c{t)F  {t+  1 )  , 

where  a,  b,  and  c  are  primary  input  values.  Testability  metrics  for  the  pipeline  are 
shown  in  Table  8-3.  Again,  the  first  version  has  a  minimal  circular  self- test  path;  in 
this  case,  the  path  is  ( <=  REG3,  «=  REG2,  <=  REG  1,  <=  REG7).  Version  I  has  good 
transparency  throughout;  this  is  because  the  use  of  the  circular  self- test  path  technique 
adds  transparency  to  the  inputs  of  the  circuit,  and  the  high  sensitivity  of  the  adder  to 
changes  in  its  input  makes  the  middle  of  the  circuit  highly  transparent.  One  register  of 
the  circuit,  REG4,  has  an  extremely  low  randomness  value  of  0.5883.  Version  II  of  the 
circuit  removes  this  problem  by  placing  REG4  in  the  circular  self-test  path  ( REG3, 
«=  REG2,  5=  REGl,  =>REG4,  <=  REG7).  Computation  of  the  testability  metrics  for 
both  circuits  using  the  Markov  model  required  a  total  of  1.44  CPU  seconds. 

Fault  coverage  curves  for  the  two  versions,  given  in  Rgure  8-4,  show  that  Version  IF s 
addition  of  a  single  test  register  causes  a  dramatic  improvement  in  fault  coverage; 
attainable  fault  coverage  goes  up  fi'om  91.8%  in  Version  I  to  99.8%  in  Version  II.  Table 
8-4,  which  gives  area  and  performance  figures  for  the  two  versions  of  the  pipeline, 
shows  that  the  price  paid  for  the  added  testability  is  a  less  than  7%  increase  in  area  and 


an  8.5%  increase  in  critical  delay. 


MR  Analysis  1  0.5883  0.8854  0.9771  1 

Simulation  0.9929-  0.5783  0.8835  0.9762  0.9927 

0.9930 


ESC  Analysis  0.9981  0.6381  0.9650  0.9944  0.9981 

(in  96  Simulation  I  0.5000  0.9375  1  I 

clocks) 

I  MT  Analysis  11  1  1  1 

(a)  Version  I,  with  a  minimal  circular  self-test  path. 

~MR  Analysis  [I  1  0.9295  1  1 

Simulation  0.9996  0.9980  0.9336  1  0.9967  0.9989 

ESC  Analysis  0.9981  0.9981  0.9756  0.9981  0.9981 

(in  96  Simula-  1  1111 

clocks)  tion 

MT  Analysis  1  1111 

(b)  Version  II,  with  REG4  added  to  the  path. 

Table  8-3.  Testability  metrics  for  the  arithmetic  pipeline. 
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Rgure  8-4.  Fault  coverage  curves  for  the  arithmetic  pipeline. 


Area  in 

Transistor 

Count 

Critical  Delay 
in  ns 

Version  I 

1602  X  969.5 

1944 

10.61 

Version  II 

2032 

11.51 

overhead 

6.4% 

4.5% 

8.5% 

Table  8-4.  Area  and  performance  figures  for  the  arithmetic  pipeline. 


8.4  Example  Three:  A  Low  Pass  Filter 

Our  third  example,  shown  in  Figure  8-5,  is  adapted  from  a  low  pass  filter  in 
[COOS93].  The  datapath  is  nine  bits  wide,  using  fixed  point  numbers  with  two  bits 
after  the  binary  point.  The  25%  and  75%  ALUs  multiply  their  single  inputs  by  0.25 
and  0.75,  respectively. 

Version  /  of  the  circuit  has  a  minimal  circular  self-test  path.  Testability  metrics  are 


shown  in  Table  8-5.  Because  this  circuit  has  chained  ALUs,  we  calculate  metrics  not 
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Figure  8-5.  A  low  pass  filter. 

only  at  the  registers,  but  also  at  points  a  and  b?  Computation  of  the  testability  metrics 
using  the  Markov  model  required  a  total  of  3.50  CPU  seconds.  For  this  example,  trans¬ 
parency  values  are  trivially  very  high,  because  addition  is  highly  sensitive  to  changes 
in  an  input,  but  the  randomness  of  point  b  is  quite  low  because  the  multiplication  by  ^ 
clears  the  two  most  significant  bits.  In  order  to  improve  the  randomness  at  b,  we  insert 
a  test  register  that  is  transparent  in  normal  mode.  Version  fl  of  the  circuit  inserts  a  full 
nine  bit  test  register  at  point  b.  For  this  example  only  two  of  the  nine  bits  actually  have 
low  entropy,  so  we  can  save  area  by  using  a  test  register  of  only  two  bits;  we  do  this  in 
Version  III.  Fault  coverage  curves  for  all  versions  of  the  circuit  are  shown  in  Figure 
8-6.  Note  that  Versions  II  and  III  are  virmally  indistinguishable  in  terms  of  fault  cover¬ 
age;  in  terms  of  test  quality,  enhancing  only  the  low  entropy  bits  of  point  b  is  suffi¬ 
cient.  Table  8-6  shows  the  area  and  performance  figures  for  all  three  versions  of  the 
low  pass  filter;  from  the  figures,  it  is  clearly  beneficial  to  use  the  smaller,  two  bit  test 
register  at  point  b,  since  the  area  overhead  is  much  smaller  for  Version  III  than  it  is  for 

3.  This  is  accomplished  by  creating  a  version  of  the  low  pass  filter  that  has  “pseudo-registers”  at  points 
a  and  by  and  applying  the  Markov  model  to  that  version  in  the  usual  way. 
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Version  II.  The  performance  overhead  for  Version  III  is  a  bit  larger  for  than  for  Version 
II.  The  difference  is  small,  and  due  to  the  simulated  annealing  algorithm  used  by 
COMPASS  to  do  routing;  there  is  an  element  of  randomness  in  the  algorithm,  so  if  the 
routing  were  done  a  second  time,  the  performance  numbers  might  easily  be  reversed. 


REGl 

REG2 

a 

b 

MR 

Analysis 

1 

1 . 

0.9444 

0.7778 

Simulation 

0.9860 

0.9870 

0.9351 

0.7750 

ESC  (in 

Analysis 

0.9975 

0.9975 

0.7488 

. 

0.2500 

3072  clocks) 

Simulation 

0.9961 

0.9980 

0.7500 

0.2520 

MT 

Analysis 

1 

1 

1 

1 

(a)  Version  /,  wit 

h  a  minimal  circular  self- test  path. 

MR 

Analysis 

1 

1 

1 

Simulation 

0.9850 

0.9855 

0.9878 

ESC  (in 

Analysis 

EBl 

0.9975 

3072  clocks) 

Simulation 

IBH 

mi 

MT 

Analysis 

1 

1 

1 

1 

( 

d)  Versions  II  and  III,  wit 

h  a  test  register  at  b. 

Table  8-5.  Testability  metrics  for  the  low  pass  filter. 


Figure  8-6.  Fault  coverage  curves  for  the  low  pass  filter. 
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Area  in 

Transistor 

Count 

Critical  Delay 
in  ns 

Version  I 

1402  X  725 

1506 

17.88 

Version  II 

1666  X  857.5 

2190 

18.05 

Version  III 

1522  X  765 

1658 

18.12 

overhead  for  Version  II 

40.5% 

45.4% 

1.0%  ! 

! 

overhead  for  Version  III 

14.5% 

10.1% 

1.3% 

Table  8-6.  Area  and  performance  figures  for  the  low  pass  filter. 


8.5  Example  Four:  An  Elliptical  Wave  Filter 

Our  final  example  is  derived  from  a  fifth  order  elliptical  wave  filter  commonly  used  as 
a  benchmark  for  high  level  synthesis  systems  [PaKn89].  First,  a  four  bit  wide  wave  fil¬ 
ter  was  synthesized  without  test  features  using  the  SYNTEST  high  level  synthesis  sys¬ 
tem  [HPCN92].  Next,  since  our  test  effort  focuses  on  the  arithmetic  logic  units 
(ALUs),  the  assumption  that  multiplexer  control  would  be  held  constant  during  testing 
was  made,  and  so  a  single  path  through  each  multiplexer  was  chosen.  In  effect,  this 
allowed  the  removal  of  all  multiplexers  and  some  registers  not  used  for  testing  from 
the  circuit.  The  resulting  circuit  is  shown  in  Figure  8-7,  and  contains  reconvergent 
fanout,  feedback  loops,  and  self-adjacency.  Next,  circular  BIST  features  were  added 
to  the  circuit.  For  this  example,  a  minimal  circular  self-test  path  was  added: 
(<=REG10,  <=REG11,  ^REG12,  <=REG13,  <=  REG4).  Testability  metrics  are 
shown  in  Table  8-7.  Computation  of  the  testability  metrics  using  the  Markov  model 
required  a  total  of  3.09  CPU  seconds.  In  this  case,  the  randomness  and  transparency 
values  for  all  the  registers  are  quite  high.  On  the  basis  of  the  high  testability  metrics,  it 
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Figure  8-7.  An  example  derived  from  an  elliptical  wave  filter. 

was  decided  that  it  was  not  necessary  to  add  test  features  to  any  more  of  the  registers. 
A  fault  coverage  curve  for  the  circuit,  given  in  Hgure  8-8,  shows  that  100%  fault  cov¬ 
erage  is  quickly  reached  for  the  circuit  with  a  minimal  circular  self- test  path.  Area  and 
performance  figures  are  given  in  Table  8-8. 

Note  that  different  choices  of  multiplexer  control  result  in  different  testability  metrics 
for  the  registers  of  the  circuit;  by  choosing  a  path  through  each  multiplexer,  we  are 
configuring  the  circuit  in  a  specific  way  during  test.  The  configuration  chosen  here 
required  only  a  minimal  circular  self- test  path,  but  another  configuration  may  require 
additional  test  registers,  resulting  in  a  larger  and  slower  circuit.  A  further  direction  of 
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Table  8-7.  Testability  metrics  for  the  wave  filter  example 
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Figure  8-8.  Fault  coverage  curve  for  the  wave  filter  example. 


Area  in 

Transistor 

Count 

Critical  Delay 
in  ns 

2186  X  1282.5 

3600 

12.75 

Table  8-8.  Area  and  performance  figures  for  the  wave  filter  example. 


research  might  involve  how  to  choose  paths  through  the  multiplexers  so  as  to  make  the 
ALUs  of  the  circuit  testable  with  a  minimum  of  area  and  performance  overhead. 


8.6  Summary 

The  examples  of  this  chapter  show  how  testability  metrics  can  be  used  to  guide  built- 
in  self-test  (BIST)  insertion  in  register  transfer  level  datapaths.  The  testability  metrics 
of  randomness,  expected  state  coverage,  and  transparency  are  used  to  implicitly  cap¬ 
ture  the  properties  of  a  successful  built-in  self-test.  Fault  coverage  results  show  that 
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BIST  insertion  done  on  the  basis  of  these  metrics  significantly  eases  the  testability 
problem  without  resorting  to  an  expensive  traditional  BIST  implementation,  which 
requires  that  test  registers  be  placed  around  each  arithmetic  logic  unit. 


Chapter  9 


BIST  Insertion  in  the 
Behavioral  Domain 


As  the  complexity  of  very  large  scale  integration  (VLSI)  design  grows,  design  synthe¬ 
sis  tools  have  been  developed  at  higher  and  higher  levels  of  design  abstraction.  Over 
the  last  ten  years  or  so,  there  are  have  been  many  projects  to  build  high  level  synthesis 
tools  that  take  as  input  a  behavior,  or  a  description  of  the  function  to  be  performed  by 
the  circuit  in  algorithm  form,  and  produce  a  structure,  usually  a  register  transfer  level 
(RTL)  implementation  of  the  behavior. 

Although  design  synthesis  tools  continue  to  move  to  higher  levels  of  design  abstrac¬ 
tion,  design-f or- testability  (DFT)  tools  have  not  kept  pace.  In  Chapter  Three,  we  out¬ 
lined  a  number  of  current  projects  dealing  with  testability  insertion  in  high  level 
synthesis,  based  on  both  BIST  and  automatic  test  pattern  generation  (ATPG).  These 
approaches  are  all  limited  in  some  way.  Most  incorporate  testability  by  imposing  their 
own  restricted  design  style;  for  example,  the  work  of  [PaCH91]  enhances  BIST  test¬ 
ability  by  removing  self-loops  in  the  datapath,  and  the  work  of  [DePo94]  enhances 
ATPG  testability  by  minimizing  the  number  of  large  feedback  loops.  Most  current 
high  level  synthesis  for  testability  approaches  lack  the  flexibility  to  coordinate  with 
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general  purpose  synthesis  tools,  and  many  lack  testability  analysis  capability  to  guide 
the  synthesis  process. 

This  chapter  presents  a  new  method  for  test  synthesis  in  the  behavioral  domain  based 
on  BIST.  Specifically,  we  have  developed  a  technique  for  BIST  insertion  in  behavioral 
design  descriptions  that  coordinates  through  a  common  VHSIC  hardware  description 
language  (VHDL)  interface  with  other  synthesis  tools  in  the  behavioral  and  structural 
domains.  Our  approach  operates  on  a  given  design  behavior,  expressed  in  behavioral 
VHDL,  that  describes  the  desired  behavior  of  the  design  during  normal  mode  opera¬ 
tion.  The  basis  of  our  approach  is  to  derive  a  test  behavior  from  the  design  behavior. 
The  test  behavior,  which  is  also  expressed  in  VHDL,  describes  the  behavior  of  the 
design  in  test  mode,  and  is  derived  by  fixing  any  testability  problems  in  the  design 
behavior.  Thus,  the  test  behavior  describes  in  essence  the  BIST  insertion  for  the 
design.  A  merging  of  the  normal-mode  design  behavior  and  the  test-mode  test  behav¬ 
ior  is  then  synthesized  to  produce  a  testable  design  with  inserted  BIST  structures. 

One  key  aspect  of  our  approach  is  the  use  of  behavioral  testability  metrics  that  quan¬ 
tify  the  controllability  and  observability  of  signals  embedded  within  a  behavior.  By 
using  these  metrics  to  quantify  the  testability  of  behaviors,  we  can  modify  behaviors 
before  design  synthesis  even  begins  so  as  to  ensure  that  the  resulting  circuit,  when  syn¬ 
thesized  by  a  general  purpose  high  level  synthesis  tool,  will  be  easily  testable  using  a 
simple  BIST  scheme.  It  should  be  noted  that  our  behavioral  insertion  can  employ  not 


165 


only  our  own  test  metrics  but  also  test  metrics  developed  by  others,  for  example,  those 
of  [PeKu94],  based  on  flowgraph  analysis  of  a  behavior. 

Another  key  aspect  of  our  research  has  involved  exploration  of  the  connection 
between  behavioral  testability  and  testability  of  physical  implementations  of  the 
behavior.  By  showing  what  ramifications  a  low  testability  signal  in  the  behavior  can 
have  on  testability  at  the  gate  level  once  the  behavior  is  synthesized,  we  provide  a  link 
between  our  behavioral  testability  metrics  and  structural  testability  of  the  synthesized 
circuit. 

Our  approach  goes  beyond  current  approaches  that  focus  exclusively  on  the  datapath 
by  developing  a  methodology  that  produces  circuits  that  are  completely  testable,  both 
datapath  and  controller  together.  Behavioral  insertion  is  used  to  enhance  the  testability 
of  the  datapath,  while  structural  insertion  is  used  for  die  controller.  The  fact  that  our 
approach  is  designed  to  be  used  with  any  general  purpose  high  level  synthesis  system 
adds  flexibility  to  our  approach. 

The  chapter  is  organized  as  follows.  Section  9. 1  describes  the  minimal  behavioral 
BIST  scheme  for  datapaths,  and  explains  the  rationale  behind  the  scheme.  Section  9.2 
shows  how  our  testability  metrics  are  used  as  the  basis  of  a  test  point  insertion  method¬ 
ology  in  the  behavioral  domain  to  improve  the  testability  of  the  datapath.  Section  9.3 
describes  the  structural  test  point  insertion  that  we  use  to  improve  the  testability  of  the 
controller  and  the  interface  between  the  datapath  and  the  controller.  Section  9.4 
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describes  our  overall  test  scheme,  which  uses  the  behavioral  test  scheme  to  test  the 
datapath  and  a  circular  BIST  scheme  to  test  the  controller.  Section  9.6,  Section  9.7, 
and  Section  9.8  provide  results  for  three  example  circuits,  and  Section  9.9  is  a  sum¬ 
mary. 


9.1  A  Behavioral  Test  Scheme 

Conventional  BIST  schemes  for  datapaths  partition  the  datapath  into  small  kernels, 
and  test  each  kernel  separately.  The  simplest  conventional  test  scheme  breaks  the  data¬ 
path  up  so  that  each  kernel  contains  a  single  functional  unit  (ALU).  Test  pattern  gener¬ 
ation  registers  (TPGRs)  are  placed  at  the  inputs  of  each  kernel,  and  multiple  input  shift 
registers  (MISRs)  are  placed  at  the  outputs  of  each  kernel,  so  that  each  ALU  is  tested 
separately.  Test  scheduling  determines  which  kernels  can  be  tested  in  parallel. 

Many  deterministic  or  automatic  test  pattern  generation  (ATPG)  based  approaches 
also  rely  on  partitioning  of  the  datapath.  For  example,  the  work  of  [BhJh94]  uses 
behavioral  information  to  find  a  test  environment  for  each  ALU  in  a  datapath  individu¬ 
ally.  By  setting  the  control  in  a  special  way  so  as  to  allow  test  vectors  to  be  justified  at 
the  inputs  to  the  ALU  and  test  responses  to  be  propagated  from  the  output  of  the  ALU 
to  the  primary  outputs,  this  method  in  effect  separates  the  ALU  from  its  behavioral 
environment  during  test. 
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In  contrast,  our  approach  tests  the  datapath  as  a  whole;  our  test  scheme,  which  we  call 
minimal  behavioral  BIST,  is  shown  in  Figure  9- 1  as  viewed  m  both  the  behavioral  and 
structural  domains.  In  the  behavioral  domain,  we  separate  the  signals  and  variables  of 
a  behavior  into  three  disjoint  subsets:  input  signals,  output  signals,  and  internal  vari¬ 
ables.  Our  scheme  makes  all  input  signals  directly  controllable  by  mapping  them  onto 
BIST  input  registers  in  the  register  transfer  level  (RTL)  circuit  implementation  of  the 
behavior,  and  makes  all  output  signals  directly  observable  by  mapping  them  onto 
BIST  output  registers  in  the  RTL.  The  scheme  is  minimal  in  the  sense  that  no  insertion 
is  done  within  the  behavior;  no  internal  variables  are  made  BIST.  In  the  structural 
domain,  this  means  that  the  datapath  is  tested  as  a  whole  by  placing  a  TPGR  across  all 
the  primary  inputs  (Pis)  to  the  datapath,  and  an  MISR  across  all  the  primary  outputs 
(POs).  During  test  mode,  as  the  TPGR  generates  test  patterns,  the  datapath  is  exercised 
according  to  its  behavior  for  which  it  was  synthesized. 

An  important  consequence  of  our  behavior-based  scheme  in  the  structural  domain  is 
that  we  exercise  every  single  coimection  and  component  in  the  circuit.  This  approach 
is  in  contrast  to  traditional  BIST  approaches;  in  approaches  that  focus  on  the  arith¬ 
metic  logic  units,  some  registers  and  multiplexers  may  not  be  part  of  any  kernel,  so 
that  some  components  may  be  neglected.  In  addition,  interconnections  between  ker¬ 
nels  may  be  neglected.  A  similar  limitation  exists  for  the  ATPG-based  work  of 
[BhJh94];  concentrating  on  the  arithmetic  logic  units  does  not  guarantee  high  fault 
coverage  for  the  overall  datapath,  so  if  high  enough  fault  coverage  is  not  obtained,  the 
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Figure  9-1.  Behavioral  and  structural  views  of  minimal  behavioral  BIST  scheme. 
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method  must  add  an  extra  step  that  focuses  on  the  registers  or  multiplexers  that  were 
missed.  By  exercising  the  datapath  according  to  the  behavior,  we  can  guarantee  that  all 
parts  of  our  datapath  will  be  thoroughly  exercised,  because  each  control  mode  of  each 
component  is  used  at  some  point  during  the  behavior.  For  example,  if  the  datapath 
contains  a  three-to-one  multiplexer,  each  of  the  three  paths  through  the  multiplexer 
will  be  used  at  some  point  if  the  datapath  is  exercised  according  to  the  behavior.  If  one 
of  the  paths  were  not  exercised  by  the  behavior,  then  a  three-to-one  multiplexer  would 
not  have  been  synthesized  in  the  first  place;  a  two-to-one  multiplexer  would  have  been 
used  instead. 

Exercising  the  datapath  circuit  according  to  the  behavior  for  which  it  was  synthesized 
even  while  testing  is  intuitively  appealing  for  a  number  of  reasons.  The  first  is  that  we 
eliminate  the  difficult  question  of  how  to  partition  the  circuit  into  kernels,  and  how  to 
schedule  the  testing  of  the  kernels.  Secondly,  the  control  signals  that  we  need  to  test 
our  datapath  are  almost  exactly  those  control  signals  that  we  use  for  our  datapath  in 
normal  mode.  This  means  that  our  test  controller  is  relatively  simple,  and  easily 
embedded  within  the  system  controller  with  a  minimum  of  area  overhead. 

Note  that  some  special  care  must  be  taken  in  the  design  of  the  components  to  be  sure 
that  each  component  is  testable  using  only  its  defined  control  modes.  For  example, 
suppose  that  a  three-to-one  multiplexer  is  implemented  with  two  control  mode  bits. 
The  multiplexer  has  three  control  modes,  but  four  possible  combinations  of  control 


170 


signals.  The  multiplexer  must  be  designed  such  that  the  fourth,  undefined  control 
mode  is  not  needed  to  test  the  multiplexer. 

The  minimal  behavioral  BIST  scheme  will  yield  high  fault  coverage  for  a  datapath 
only  if  the  datapath  can  be  successfully  tested  by  applying  pseudorandom  patterns  at 
its  primary  data  inputs.  Behavioral  testability  insertion  is  used  to  modify  the  design 
behavior  so  that  the  synthesized  datapath  will  meet  this  requirement,  regardless  of  the 
high  level  synthesis  tool  used.  In  order  to  describe  the  testability  of  signals  embedded 
within  a  behavior,  we  use  the  randomness  and  transparency  metrics  defined  in  Chapter 
Four.  In  the  next  section,  we  describe  our  procedure  for  behavioral  testability  inser¬ 
tion. 


9.2  Behavioral  Testability  Insertion 

This  section  describes  the  use  of  testability  metrics  in  the  behavioral  domain  to  pin¬ 
point  and  correct  testability  problems  in  behaviors. 

9.2.1  Basic  concepts 

We  do  testability  insertion  in  behaviors  by  means  of  a  new  concept  that  we  call  the  test 
behavior.  This  concept  is  based  on  the  addition  of  a  test  mode  in  the  behavioral 
domain  that  is  generally  different  from  the  normal  mode  behavior,  or  design  behavior. 
The  test  behavior  can  be  generated  by  applying  a  series  of  local  behavioral-for-test 
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transformations  to  the  design  behavior.  These  transformations  or  modifications  are 
driven  by  our  test  metrics  with  the  aim  to  enhance  the  testability  qualities  of  the  behav¬ 
ior. 

The  modified  behavior  is  capable  of  executing  both  the  design  behavior  and  the  test 
behavior.  The  concept  is  pictured  in  Figure  9-2.  A  behavioral  switch  determines  which 
of  the  two  behaviors  is  executed.  Since  the  modified  behavior  is  a  merging  of  the 
design  and  test  behaviors,  we  call  it  the  design-and-test  behavior.  When  the  design- 
and-test  behavior  is  synthesized,  the  result  is  a  circuit  that  can  be  run  either  in  normal 
mode,  or  in  a  mode  more  conducive  to  test.  Thus,  the  resulting  structure  does  not  have 
testability  problems. 

In  general,  behavioral-for-test  transformations  can  produce  test  behaviors  that  bear  lit¬ 
tle  resemblance  to  the  design  behavior.  For  example,  a  test  behavior,  after  it  is  synthe- 


Figure  9-2.  The  design-and-test  behavior  concept. 
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sized,  may  target  the  testing  of  all  ALUs  of  the  circuit  structure  in  parallel  by 
reconfiguring  all  registers  as  BIST  in  test  mode.  We  have  decided  to  restrict  our  trans¬ 
formations  such  that  the  test  behaviors  generated  are  very  close  to  the  design  behavior. 
The  rationale  is  that  such  test  behaviors  are  easily  generated  and  easily  synthesized 
with  the  design  behavior  in  a  unified  way  by  a  synthesis  system.  Furthermore,  with 
this  approach  the  test  controller  is  easily  embedded  in  the  system  controller. 


Design  Behavior 


Design-and-Test  Behavior 


Transformation  for  Transformation  for 

Controllability  Observability 


A  Data  Flow 
Graph 


Controllable 

point 


Figure  9-3.  Transformations  for  the  behavioral  test  scheme. 


Two  typical  behavioral-for-test  transformations  are  shown  in  Figure  9-3.  SELECT  and 
ESCAPE  serve  to  enhance  the  controllability  and  observability,  respectively,  of  inter¬ 
nal  behavioral  variables.  When  these  transformations  are  applied,  an  input  or  output 
BIST  signal  is  inserted  in  the  behavior;  in  the  structural  domain,  this  corresponds  to 
the  insertion  of  a  primary  input  or  output  register.  However,  in  this  scheme  no  internal 
BIST  variables  are  inserted  in  the  behavior.  This  means  that  any  binding  of  internal 
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variables  into  registers  by  the  synthesizer  will  not  be  affected  by  the  BIST  insertion; 
therefore  the  binding  can  be  done  independently  of  the  BIST  insertion.  The  inserted 
SELECT  nodes  are  implemented  using  multiplexers  that  bring  in  a  new  primary  input; 
ESCAPE  nodes  are  implemented  with  the  addition  of  new  primary  output.  Note  that 
these  new  primary  inputs  and  outputs  do  not  require  the  use  of  additional  I/O  pins; 
these  new  primary  inputs  and  outputs  are  connected  only  to  the  TPGR  and  the  MISR 
used  for  test. 

The  testability  insertion  process  begins  with  the  design  behavior,  and,  using  behavior- 
for-test-transformations,  modifies  it  to  produce  the  test  behavior.  The  goal  is  to  do  as 
few  modifications  as  are  necessary  to  produce  a  behavior  that  is  testable.  We  defme  a 
behavior  to  be  testable  if  all  its  internal  variables  have  randomness  values  above  the 
threshold  and  transparency  values  above  the  threshold  The  values  of  the 
thresholds  come  from  empirical  analysis  based  on  fault  coverage;  at  the  present,  we 
are  using  Rfgf^  =  0.79  and  Tt^h  =  0.40. 

In  what  follows,  we  first  describe  the  overall  testability  insertion  procedure.  Next,  we 
motivate  this  choice  of  procedure  with  an  example  illustrating  the  relationship 
between  behavioral  testability  and  testability  of  the  synthesized  structure.  Finally,  we 
end  this  section  by  describing  how  the  overall  testability  insertion  procedure  uses 
behavioral-for-test  transformations  to  fix  local  testability  problems.  A  complete  exam¬ 
ple  of  the  testability  insertion  procedure  is  given  with  the  results  in  Section  9.6. 
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9.2.2  Testability  insertion  procedure 

The  testability  insertion  process  is  iterative  in  nature,  and  uses  the  randomness  and 
transparency  metrics  to  pinpoint  testability  problems  in  the  behavior.  Low  randomness 
or  transparency  indicates  potential  controllability  or  observability  problems  that  may 
negatively  impact  the  testability  of  the  RTL  structure  synthesized  from  the  behavior. 

The  overall  analysis  and  insertion  procedure  is  shown  in  Figure  9-4.  It  begins  by  com¬ 
puting  the  randomness  values  of  all  signals  and  variables  within  a  behavior.  Those  sig¬ 
nals  and  variables  with  randomness  below  the  threshold  are  considered  candidates 
for  controllable  point  insertion.  Of  the  candidates,  one  nearest  the  primary  inputs  is 
selected:  this  is  because  controllable  point  insertion  will  affect  the  randomness  not 
only  of  the  insertion  point,  but  also  of  all  signals  and  variables  driven  by  the  insertion 
point.  The  exact  nature  of  the  insertion  done  depends  on  the  cause  of  the  low  random¬ 
ness  at  the  selected  signal  or  variable;  the  various  causes  of  low  testability  are 
described  in  the  last  part  of  this  section,  along  with  the  appropriate  insertions.  Once 
insertion  is  done  to  improve  randomness  at  the  selected  signal  or  variable,  the  random¬ 
ness  values  for  the  behavior  are  recomputed,  and  the  process  iterates  until  all  random¬ 
ness  values  are  above  the  threshold. 

The  analysis  defers  decisions  about  transparency  improvement  until  after  all  random¬ 
ness  improvement  has  been  done.  This  is  because  a  change  in  randomness  may  affect 
transparency,  but  a  change  in  transparency  has  no  effect  on  randomness.  (For  an  expla¬ 
nation  of  this,  see  page  56.)  Once  all  randomness  values  are  above  the  threshold 
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Figure  9-4.  The  behavioral  testability  insertion  procedure. 


transparency  values  are  computed  for  all  signals  and  variables,  in  the  behavior.  All  sig¬ 
nals  and  variables  with  transparency  below  the  threshold  are  considered  candi¬ 
dates  for  observable  point  Insertion.  Of  the  candidates,  one  closest  the  primary  outputs 
is  selected;  this  is  because  observable  point  insertion  will  affect  the  transparency  not 
only  of  the  insertion  point,  but  also  of  all  signals  that  drive  the  insertion  point.  As  was 
the  case  with  controllable  point  insertion,  the  exact  nature  of  the  insertion  depends  on 
the  cause  of  the  low  transparency.  Once  observable  pomt  insertion  has  been  done  to 
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improve  the  transparency  of  the  selected  signal  or  variable,  all  transparencies  are 
recomputed.  The  process  iterates  until  all  transparency  values  are  above  the  threshold. 

9.2.3  Behavioral  versus  structural  testability 

One  important  aspect  of  our  work  is  that  we  require  that  all  signals  and  variables  of 
our  final  test  behavior  have  testability  metrics  above  the  thresholds.  Figure  9-5  illus¬ 
trates  why  we  make  this  strict  requirement.  Part  (a)  of  the  figure  shows  two  addition 
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(a)  a  data  flow  graph  fragment. 


(b)  the  structure  created  when  op  1  and  op2 
are  bound  to  the  same  physical  adder. 
Figure  9-5.  Example  of  relationship  between  behavior  and  structure. 


operations  from  a  data  flow  graph.  One  addition  (opl)  has  highly  random  data  inputs, 
while  the  other  (op2)  does  not.  Part  (b)  of  Figure  9-5  shows  a  physical  implementation 
obtmned  by  binding  operations  opl  and  op2  to  the  same  physical  adder.  From  the  point 
of  view  of  testing  the  adder,  there  is  no  need  to  boost  the  randomness  of  the  inputs  to 
the  second  operation.  The  good  quality  test  patterns  that  the  adder  receives  when  oper- 
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ation  opl  is  being  performed  are  adequate  to  test  the  adder,  so  it  does  not  matter  that 
the  test  patterns  received  when  operation  op2  is  being  performed  are  of  low  quality. 
However,  the  focus  of  our  approach  is  to  test  the  complete  datapath;  we  test  not  only 
the  arithmetic  logic  units,  but  also  the  multiplexers,  registers,  and  interconnections. 
Note  that  although  the  adder  in  Rgure  9-5  will  be  fully  tested,  the  input  multiplexers 
will  not;  without  some  kind  of  testability  insertion,  one  input  to  each  multiplexer  has 
low  randomness,  and  so  the  test  patterns  that  the  multiplexers  receive  are  of  low  qual¬ 
ity.  For  this  reason,  our  methodology  insists  that  insertion  be  done  to  enhance  every 
low  testability  signal  or  variable  in  a  behavior.  This  approach  has  another  benefit,  in 
addition  to  the  fact  that  it  allows  a  complete  test  of  the  datapath  as  a  whole;  it  also 
lends  itself  well  to  a  pre-synthesis  approach,  because  the  testability  insertion  needed  is 
independent  of  the  choices  made  during  binding.  This  means  that  our  testability  inser¬ 
tion  approach  does  not  require  a  special  high  level  synthesizer;  it  is  appropriate  for  use 
with  any  general  purpose  high  level  synthesis  tool. 

9.2.4  Three  causes  of  low  testability 

This  subsection  describes  how  insertion  is  done  to  improve  the  testability  of  a  signal  or 
variable.  Suppose  that  a  signal  or  variable  has  low  testability,  which  we  define  as  hav¬ 
ing  either  randomness  below  the  threshold  or  transparency  below  the  threshold 
Tfsh-  In  some  cases,  a  SELECT  or  ESCAPE  transformation  is  used  directly  at  the  point 
of  unacceptable  testability  to  improve  the  controllability  or  observability.  In  other 
cases,  it  is  more  beneficial  to  do  insertion  at  a  signal  or  variable  slightly  removed  from 
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the  unacceptable  point.  We  now  use  examples  to  show  the  three  possible  causes  of  low 
testability  and  the  proper  insertion  for  each. 

Figure  9-6(a)  shows  low  randomness  at  a  variable  due  to  an  inherent  inability  of  a 
function  to  transfer  randomness.  Although  the  inputs  to  the  integer  divider  are  com¬ 
pletely  random,  the  nature  of  division  is  such  that  the  output  has  poor  randomness.  In 
this  case,  there  is  no  way  to  remove  the  point  of  low  randomness  entirely,  but  we  can 
make  sure  that  the  low  randonmess  signal  is  not  used  to  provide  test  patterns  for  any 
part  of  the  circuit.  Thus,  if  our  original  design  behavior  contains  a  structure  like  that  of 
Figure  9-6(a),  we  use  a  SELECT  transformation  to  provide  direct  control  of  the  low 
randomness  variable  downstream  of  the  divider  during  test.  In  doing  so,  we  must  also 
use  the  ESCAPE  transformation  to  insert  an  observable  point,  so  that  we  can  observe 
the  response  of  the  behavior  upstream  of  the  divider;  otherwise,  this  part  of  the  behav¬ 
ior  will  “dead-end”  when  the  circuit  is  in  test  mode.  Rgure  9-6(b)  shows  the  proper 
test  behavior  for  this  example.  The  design-and-test  behavior  will  behave  as  shown  in 
part  (a)  of  Figure  9-6  when  in  design  mode,  and  as  shown  in  part  (b)  of  the  figure  when 
in  test  mode.  Note  that  while  the  low  randomness  variable  still  exists  in  the  data  flow 
graph,  it  no  longer  provides  test  patterns  to  any  part  of  the  behavior. 

Figure  9-7(a)  shows  low  randomness  at  a  variable  due  to  correlation  among  the  inputs 
to  the  operation  driving  the  variable.  The  figure  shows  a  multiplier  being  used  to  com¬ 
pute  the  square  function;  the  output  of  the  multiplier  has  low  randomness  because  the 
square  function  has  a  limited  output  space.  In  this  case,  it  is  most  beneficial  to  use  a 
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(new  PO) 


(a)  original  data  flow. 


(b)  data  flow  after  insertion  of  a 
controllable/observable  pair. 


Figure  9-6.  Insertion  when  functionality  causes  low  randomness. 


SELECT  transformation  to  break  the  correlation  by  providing  a  new,  independent  sig¬ 
nal  at  one  operation  input  during  test  mode.  The  data  flow  of  the  transformed  behavior 
in  test  mode  is  shown  in  Figure  9-7 (b).  Note  that  by  breaking  the  correlation,  all  low 
randomness  variables  are  removed,  and  all  operations  receive  high  quality  test  pat¬ 
terns.  There  are  two  reasons  for  using  insertion  to  break  the  correlation,  rather  than 
doing  the  insertion  directly  at  the  point  of  low  randomness,  as  we  did  in  the  previous 
example.  The  first  is  that  by  removing  the  correlation,  we  can  improve  the  testability 
metrics  by  inserting  a  single  controllable  point,  rather  than  the  controllable/observable 
pair  needed  in  the  previous  example.  The  second  stems  from  the  fact  that  the  correla¬ 
tion  makes  it  difficult  to  test  the  multiplier.  Although  in  the  original  behavior  the  two 
inputs  to  the  multiplier  have  high  randomness  values,  their  high  degree  of  correlation 
severely  restricts  the  test  patterns  received  by  the  multiplier. 
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(a)  original  data  flow. 


(b)  data  flow  after  insertion  of  a 
controllable  point. 

Figure  9-7.  Insertion  when  correlation  causes  low  randomness. 


Figure  9-8(a)  shows  low  transparency  at  a  variable  that  is  due  neither  to  functionality 
nor  correlation.  Here,  the  low  transparency  is  caused  by  gradual  degradation  of  trans¬ 
parency  as  we  move  up  the  data  flow  graph.  The  primary  output  of  the  graph  is 
directly  observable,  and  therefore  has  transparency  one.  One  level  up,  the  transparency 
drops  to  a  value  of  0.50;  although  this  value  is  above  the  threshold,  it  is  low  enough  to 
cause  unacceptable  transparency  one  level  beyond  that.  In  this  case,  it  is  most  benefi¬ 
cial  to  use  an  ESCAPE  transformation  to  do  the  insertion  at  the  variable  of  medium 


transparency,  as  shown  in  Figure  9-8(b);  in  this  way,  all  transparencies  are  brought 
above  threshold  with  a  single  additional  observation  point.  In  contrast,  if  insertion  is 
done  directly  at  the  low  testability  points,  two  observable  points  are  needed.  Also, 
when  insertion  is  done  directly  at  a  low  testability  point,  the  low  testability  point  is  not 
completely  eliminated;  this  is  shown  in  Figure  9-9.  The  insertion  of  the  observable 
point  adds  a  fanout  branch  to  the  behavior.  While  the  source  of  the  fanout  branch  has 
enhanced  observability,  one  branch  still  has  low  observability.  At  the  register  transfer 
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level,  this  can  correspond  to  a  multiplexer  with  poor  observability,  as  shown  in  part  (c) 
of  Figure  9-9. 


(a)  original  data  flow. 


(b)  data  flow  after  insertion  of  an 
observable  point. 

Figure  9-8.  Insertion  when  gradual  degradation  causes  low  transparency. 


9.5  Structural  Testability  Insertion  for  the  Controller 

While  it  is  possible  to  evaluate  the  testability  of  the  datapath  in  the  behavioral  domain, 
we  do  insertion  for  the  controller  and  the  interface  between  the  controller  and  the  data¬ 
path  in  the  structural  domain.  The  controller  is  implemented  as  a  finite  state  machine, 
as  shown  in  Figure  9- 10.  We  improve  the  testability  of  the  controller  by  adding  circu¬ 
lar  BIST  capabilities  to  the  state  flip-flops.  During  test,  the  state  flip-flops  linked 
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(a)  original  data  (b)  data  flow  after  implementation, 

flow.  insertion. 

Figure  9-9.  Observable  point  insertion  at  the  point  of  low  transparency. 
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Figure  9-10.  Finite  state  machine  implementation  of  the  controller. 
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together  into  a  chain  that  can  be  configured  as  either  a  circular  BIST  register  or  an 
ordinary  serial  shift  register. 

In  addition,  we  do  some  structural  test  point  insertion  at  the  interface  between  the  con¬ 
troller  and  the  datapath.  The  interface  consists  of  the  signals  created  by  the  controller 
to  control  the  datapath,  and  includes  both  multiplexer  select  controls  and  register  load 
controls  for  the  datapath.  The  interface  is  tested  while  the  datapath  is  tested  using  the 
behavioral  test  scheme  described  in  Section  9.1.  Thus,  the  interface  lines  are  observed 
through  the  datapath  while  the  datapath  is  operating  in  its  design  mode. 

We  decide  where  test  point  insertion  is  necessary  in  the  interface  by  devising  observ¬ 
ability  metrics  for  the  interface  control  lines.  A  problem  on  a  multiplexer  select  line  is 
quite  easy  to  see,  as  it  causes  the  datapath  to  operate  on  incorrect  data  during  one  or 
more  control  steps.  For  this  reason,  we  assign  all  multiplexer  select  lines  transparency 
values  of  one. 

A  problem  on  a  register  load  line  can  be  more  difficult  to  detect.  For  example,  suppose 
that  a  register  load  line  is  stuck  active,  so  that  the  register  is  inadvertently  loaded  at 
each  control  step.  This  wiU  only  be  noticed  if  the  inadvertent  loads  corrupt  data  before 
the  data  can  be  used,  i.e.,  if  the  register  is  asked  to  hold  data  for  more  than  one  control 
step  and  is  unable  to  do  so  because  of  the  inadvertent  loads.  Thus,  the  transparency  of 
the  register  load  lines  depends  on  the  pattern  of  reads  and  writes  for  the  registers.  The 
transparency  metrics  are  computed  by  combining  structural  information  about  how  the 
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signals  and  variables  of  the  behavior  are  bound  to  the  registers  of  the  datapath  with 
behavioral  information  about  when  the  signals  and  variables  are  written  and  read  in 
the  scheduled  data  flow  graph.  We  use  the  scheduled  data  flow  graph  fragment  of  Fig- 
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load  R2  =  load  (x)  OR  load  (z) 


Figure  9-11.  A  scheduled  data  flow  graph  fragment. 


ure  9- 1 1  as  an  example.  First,  all  the  signals  and  variables  of  the  scheduled  data  flow 
graph  are  divided  into  two  classes:  those  that  are  used  only  in  the  control  step  after 
which  they  are  created,  and  those  which  are  used  two  or  more  control  steps  after  they 
are  created.  In  our  example,  variables  w,  y,  and  z  are  of  the  first  class,  and  x  is  of  the 
second  class.  If  all  signals  and  variables  bound  to  a  register  are  of  the  first  class,  the 
load  line  for  that  register  is  assigned  a  transparency  value  of  zero.  If  at  least  one  signal 
or  variable  bound  to  a  register  is  of  the  second  class,  the  load  line  for  that  register  is 
assigned  a  transparency  value  of  one.  Suppose  that  in  our  example,  w  and  y  are  bound 
to  one  register  R j,  and  jc  and  z  are  bound  to  another  register  /?2.  The  transparency  value 
of  is  then  zero,  while  the  transparency  value  of  /?2  is  one.  The  reason  that  it  is 
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impossible  to  see  a  stuck-active  fault  on  the  load  line  for  register  is  that  Rj  is  never 
asked  to  hold  its  value. 

For  the  register  load  lines  with  transparency  values  of  zero,  structural  observable  point 
insertion  is  done.  An  additional  primary  output  is  added  to  the  datapath  to  watch  one 
bit  of  the  register,  so  that  inadvertent  loads  can  be  seen.  For  the  example,  we  should 
watch  one  bit  of  register  Rj.  This  brings  the  transparency  of  the  load  line  for  register 
Rj  up  from  a  value  of  zero  to  a  value  of  one,  so  that  we  can  see  whether  the  register’s 
load  line  is  working  properly. 


9.4  Overall  Test  Scheme 

Our  goal  is  to  ensure  testability  of  the  complete  physical  implementation  of  a  behavior, 
including  both  datapath  and  controller,  and  the  interface  between  the  two,  as  shown  in 
Figure  9-12.  We  accomplish  the  overall  test  by  using  three  test  sessions.  The  first  test 
session  is  designed  to  catch  faults  on  the  reset  lines  of  the  state  flip-flops;  these  faults 
can  prevent  proper  initialization  of  the  state,  causing  the  circuit  to  behave  unpredict- 
ably.  This  short  test  session  configures  the  state  flip-flops  in  the  controller  in  a  serial 
shift  chain.  It  starts  by  shifting  in  a  deterministic  pattern  that  is  the  complement  of  the 
reset  state.  Then,  it  resets  the  flip-flops,  and  shifts  out  the  state.  In  this  way,  it  is  possi¬ 
ble  to  detect  if  any  state  flip-flop  does  not  reset  properly. 


data  inputs 
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The  second  test  session  uses  the  behavioral  test  scheme  to  detect  faults  within  the  data¬ 
path,  which  consists  of  registers,  arithmetic  logic  units,  and  multiplexers.  During  this 
session,  we  operate  the  controller  in  normal  mode,  so  that  it  creates  the  proper  control 
signals  to  operate  the  datapath  according  to  the  behavior.  We  then  use  a  TPGR  to  pro¬ 
vide  test  patterns  for  the  datapath,  and  an  MISR  to  observe  the  outputs  of  the  datapath. 
Note  that  it  is  necessary  to  mn  the  datapath  in  both  design  and  test  mode;  while  only 
test  mode  is  necessary  to  test  the  main  part  of  the  datapath,  it  is  necessary  to  use  design 
mode  to  fully  test  the  multiplexers  that  implement  the  behavioral  switch  between  the 
two  modes.  For  this  reason,  we  drive  the  behavioral  mode  control  (test  control  signal) 
from  the  TEKjR. 

The  third  test  session  is  designed  to  detect  faults  in  the  combinational  logic  of  the  con¬ 
troller.  This  includes  both  the  logic  that  determines  the  next  state  of  the  controller  from 
the  current  state,  and  the  logic  that  decodes  the  proper  control  signals  from  the  current 
state.  For  this  test  session,  we  configure  the  state  register  of  the  controller  into  a  circu¬ 
lar  BIST  register.  The  shift  input  of  the  register  is  driven  by  the  TPGR,  and  the  shift 
output  of  the  register  is  observed  by  the  MISR  [Davi94].  All  outputs  of  the  controller 
are  observed  by  the  MISR,  and  all  inputs  of  the  controller,  including  the  status  and 
external  status  lines,  are  driven  by  the  TPGR. 

The  order  of  the  test  sessions  is  not  important,  and  was  chosen  for  simplicity  of  imple¬ 
mentation.  We  chose  to  test  the  reset  faults  first  because  doing  so  requires  a  determin¬ 
istic  pattern,  and  it  was  easy  to  design  our  TPGR  so  that  the  deterministic  pattern  we 
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need  is  shifted  out  first.  Circuitry  on  chip  is  used  to  create  the  proper  control  signals  to 
move  from  test  session  to  test  session. 


9.5  Background  for  Experiments 

The  rest  of  this  chapter  is  devoted  to  examples  of  our  behavioral  BIST  insertion 
scheme.  For  each  example,  we  start  with  a  design  behavior  written  in  VHDL.  All  of 
the  examples  presented  in  this  chapter  use  four  bit  wide  signals.  We  apply  our  behav¬ 
ioral  BIST  insertion  procedure  to  the  design  behavior  to  derive  a  design-and-test 
behavior.  Next,  we  synthesize  the  design-and-test  behavior,  following  the  steps  of  the 
normal  ASIC  design  flow  (see  Figure  2-10  on  page  24).  We  also  synthesize  the  origi¬ 
nal  design  behavior  directly,  for  the  purposes  of  comparison.To  underscore  the  point 
that  our  BIST  insertion  procedure  is  not  designed  specifically  for  use  with  a  particular 
high  level  synthesis  system,  we  use  two  different  systems,  the  SYNTEST  high  level 
synthesis  system  developed  at  Case  Western  Reserve  University  [HPCN92]  and  the 
Behavioral  Design  Assistant  (BdA)  developed  at  the  University  of  California  at  Urv- 
ine  [RaGa94];  thus,  we  are  able  to  conclude  that  the  success  of  our  method  does  not 
rely  on  using  a  specific  style  of  design. 

When  high  level  synthesis  is  complete,  we  have  a  register  transfer  level  datapath  and 
control  flow  in  the  form  of  a  state  diagram.  Logic  level  synthesis  is  done  using  the 
COMPASS  Design  Automation  suite  of  tools  [CODA92],  using  a  finite  state  machine 
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implementation  for  the  controller.  At  this  point,  structural  circular  BIST  insertion  is 
done  within  the  controller  for  the  design-and-test  version,  as  described  in  Section  9.3. 
In  addition,  the  testability  metrics  for  the  interface  control  signals  are  computed,  and 
additional  test  points  are  added  where  necessary  to  improve  the  observability  of  the 
interface. 

Fault  coverage  curves  are  found  for  the  resulting  logic  level  circuits  using  AT&T’s 
GENTEST  fault  simulator,  and  the  curves  for  the  design-and-test  versions  are  com¬ 
pared  to  curves  for  the  original  design.  The  probability  of  aliasing  within  the  MISRs  is 
neglected.  Although  the  datapath  and  the  controller  are  tested  together  as  a  unit,  the 
fault  coverage  results  are  separated  into  two  curves;  this  is  done  to  make  clear  how 
much  of  the  fault  coverage  increase  comes  from  the  behavioral  insertion  in  the  datap¬ 
ath,  and  how  much  comes  from  the  structural  insertion  in  the  controller. 

In  order  to  understand  the  cost  of  our  method,  we  supply  area  and  performance  figures 
for  both  the  design  and  design-and-test  versions  of  the  examples.  Both  area  and  perfor¬ 
mance  are  derived  from  a  layout  level  version  of  the  circuits,  synthesized  from  the 
logic  level  using  the  COMPASS  Design  Automation  suite  of  tools.  Area  is  expressed 
as  a  transistor  count.  Performance  is  captured  by  the  critical  delay,  which  is  the  delay 
along  the  slowest  combinational  path  in  a  circuit.  Critical  delay  determines  the  top 
speed  at  which  a  circuit  can  be  clocked.  For  all  area  and  performance  figures,  over¬ 
head  is  given  by: 
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overhead  = 


where  vj&t  is  the  value  for  the  design-and-test  version  of  the  circuit,  and  is  the  cor¬ 
responding  value  for  the  original  design. 


9.6  Example  One:  A  Polynomial  Evaluator 

We  demonstrate  our  iterative  method  for  deriving  a  testable  behavior  from  an  original 

behavior  on  an  example  that  evaluates  the  third  degree  polynomial 
3  2 

ax  +bx  +  ex  +  .  The  first  step  of  our  method  requires  the  computation  of  the  ran¬ 
domness  of  the  variables  in  the  behavior;  the  randomness  values  are  shown  in  Figure 
9- 13(a),  superimposed  on  a  data  flow  graph  for  the  polynomial.  One  randomness 
value,  at  m2,  falls  below  the  threshold  of  =  0.79.  Controllable  point  insertion  is 
used  to  improve  the  randomness  at  m2,  in  this  case,  since  the  low  randomness  is 
caused  by  correlation  between  the  inputs  of  m2,  we  use  insertion  to  break  the  correla¬ 
tion,  in  exactly  the  same  way  described  in  Rgure  9-7  of  Section  9.2.  Rgure  9- 13(b) 
shows  the  data  flow  graph  after  the  insertion;  for  this  graph,  all  randomness  values  are 
above  the  randomness  threshold. 

At  this  point,  the  method  considers  the  transparency  metrics.  Figure  9-  14(a)  is  the 
same  data  flow  graph  as  Figure  9- 13(b),  but  this  time  the  transparency  values  for  each 
signal  and  variable  are  shown.  Two  signals,  at  a  and  the  leftmost  branch  of  x,  fall 
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(a)  before  controllable  point 
insertion. 


(b)  after  controllable  point  insertion. 


Figure  9-13.  The  data  flow  graph  for  the  polynomial  evaluator  with  annotated 

randomness  values. 


intermediate  behavior 


test  behavior 


(a)  before  observable  point  insertion.  (b)  after  observable  point  insertion. 

Figure  9-14.  The  data  flow  graph  for  the  polynomial  evaluator  after  controllable 
point  insertion  with  annotated  transparency  values. 
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below  the  threshold  of  '^tsh  “  ®  .40.  Note  that  it  is  important  to  consider  branches  sepa¬ 
rately,  as  they  have  different  transparency  properties.  This  time,  the  cause  of  the  low 
testability  is  a  gradual  degradation  of  transparency,  like  that  described  in  Figure  9-8  of 
Section  9.2.  The  transparency  at  si  and  ml  is  above  the  threshold,  but  low  enough  to 
adversely  affect  the  transparency  further  up.  Here,  we  improve  the  transparency  by 
inserting  an  observable  point  at  sL  The  final  data  flow  graph  for  the  test  behavior  is 
shown  in  Figure  9- 14(b),  with  all  transparency  values  above  the  threshold.  Note  that 
since  observable  point  insertion  does  not  affect  randomness  values,  this  final  data  flow 
graph  also  has  all  randonmess  values  above  the  threshold  (as  shown  in  Figure 
9- 13(b)). 

The  actual  behavior  synthesized  is  a  merging  of  the  original  behavior  (the  design 
behavior)  of  Figure  9- 13(a)  and  the  derived  testable  behavior  (the  test  behavior)  of 
Figure  9- 14(b).  This  merged  behavior,  the  design-and-test  behavior,  is  shown  side  by 
side  with  the  original  in  Figure  9-15.  The  triangles  in  the  design-and-test  behavior 
denote  divergence  and  merging  of  control.  Which  way  control  flows  after  the  diverge 
node  depends  on  the  value  of  the  test  mode  input;  one  path  is  for  the  design  behavior, 
and  the  other  path  is  for  the  test  behavior.  VHDL  descriptions  for  the  original  design 
behavior  and  the  design-and-test  behavior  are  shown  in  Figure  9- 16;  the  diverge  and 
merge  nodes  of  the  data  flow  graph  become  an  i/statement  in  VHDL. 

The  design-and-test  behavior  of  Figure  9-16  was  synthesized  into  a  datapath  at  the 
register  transfer  level  (RTL)  and  a  control  flow  graph  (state  diagrams)  using  the  SYN- 
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Figure  9-15.  Scheduled  data  flow  graphs  for  the  polynomial  evaluator. 
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ENTITY  example  IS 

ENTITY  example  IS 

PORT  (a,  b.  c,  d. 

PORT  (a,  b,  c,  d,  x. 

x:  IN  BIT^VECTORO  DOWNTO  0); 

newx:  INBIT_VECTOR(3  DOWNTO  0); 

out:  OUT  BIT_VECTOR(3  DOWNTO  0) 

test;  IN  BIT; 

out. 

); 

newout:  OUT  BIT_VECTOR(3  DOWNTO  0) 

END  example; 

);  ! 

END  example;  I 

i 

ARCHITECTURE  behavior  OF  example  IS 

ARCHITECTURE  behavior  OF  example  IS 

BEGIN 

BEGIN  i 

PROCESS  (a,  b,  c,  d  x) 

PROCESS  (a,  b,  c,  d,  x,  newx,  test)  I 

VARIABLE  M 1,  M2,  M3,  M4. 

VARIABLE  M 1 ,  M2,  M3,  M4, 

S1,  S2,  S3;  INTEGER; 

S1,S2,  S3:  INTEGER; 

BEGIN 

BEGIN 

Ml  :=  a  *  x; 

Ml  :=  a  *  x; 

SI  :=M1  +b; 

SI  :=M1  +b; 

newout  <=  S 1 ; 

IF  (test  =  ‘0’)  THEN 

M2  :=  X  *  x; 

M2  :=x*x; 

ELSE 

M2  :=  X  *  newx; 

END  IF; 

M3:=S1*M2; 

M3  :=  SI*  M2; 

M4  :=c  *  x; 

M4:=c*x; 

S2  :=  M4  +  d; 

S2:=M4  +  d;  | 

S3  :=  S2+M3; 

S3:=S2  +  M3; 

out  <=  S3; 

out  <=  S3; 

END  PROCESS; 

END  PROCESS; 

END; 

END; 

(a)  design  version,  i.e.,  as  originally 

(b)  design-and-test  version,  i.e.,  after 

designed. 

behavioral  testability  insertion. 

Rgure  9-16.  The  polynomial  evaluator  behaviors,  written  in  VHDL. 


TEST  high  level  synthesis  system  [HPCN92].  The  original  design  behavior  was  also 
synthesized  for  the  sake  of  comparison.  The  RTL  datapath  for  the  design-and-test 
behavior  is  almost  identical  to  the  datapath  for  the  design  behavior.  The  datapath  for 
the  design-and-test  version  is  shown  in  Figure  9- 17,  with  the  new  elements  in  bold. 
The  design-and-test  version  has  two  additional  registers,  one  each  for  the  inserted  con¬ 
trollable  and  observable  points,  and  an  additional  flip-flop  to  hold  the  test  mode  input. 
In  addition,  one  of  the  three-to-one  multiplexers  in  the  original  datapath  was  changed 
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to  a  four-to-one  multiplexer  to  pass  the  new  controllable  point,  newx,  through  to  the 
multiplier. 


(to  controller) 


f  , 

; 

register 


mux 


ALU 


Rgure  9- 17.  The  datapath  for  the  testable  polynomial  evaluator  behavior  as 
synthesized  by  SYNTEST,  with  inserted  elements  in  bold. 


The  control  flow  graphs  for  the  design  and  design-and-test  controllers  are  also  very 
similar,  and  are  shown  in  Rgure  9-18.  For  the  original  design,  which  has  no  condition¬ 
als,  the  control  flows  in  a  straight  line  from  control  step  0,  when  the  primary  inputs  are 
read,  to  control  step  6,  when  the  final  output  is  written.  For  the  testable  design,  the 
only  difference  is  a  branch  in  control  step  2  that  allows  the  controller  to  implement 
either  m2  =  x  •  x  for  design  mode  operation  or  m2  =  x  ■  newx  for  testable  mode 
operation. 

At  this  point,  the  synthesis  was  completed  to  the  logic  level,  and  structural  circular 
BIST  insertion  was  done  within  the  controller.  In  addition,  consideration  of  the  test¬ 
ability  metrics  for  the  interface  control  signals  necessitated  the  addition  of  some 
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Figure  9-18.  Control  flow  graphs  for  the  polynomial  evaluator. 
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observable  points  at  the  interface  between  the  datapath  and  the  controller.  This 
required  watching  one  bit  each  of  the  register  that  stores  a,  the  register  that  stores  test, 
and  the  register  that  stores  mj  and  m^.  Fault  coverage  results  for  the  datapath  and  con¬ 
troller  are  shown  in  Figure  9- 19.  The  results  show  that  the  circuit  synthesized  from  the 
design-and-test  behavior  is  substantially  more  testable.  Attainable  fault  coverage  rises 
in  the  datapath  as  a  result  of  the  behavioral  test  point  insertion,  and  in  the  controller  as 
a  result  of  the  structural  test  point  insertion.  Further,  fault  simulation  shows  that  the 
test  scheme  described  provides  very  high  fault  coverage  for  the  overall  circuit,  both 
datapath  and  controller  together. 


Time  in  clocks 


6 

O 

u 

> 

O 

U 


(a)  for  the  datapath,  (b)  for  the  controller. 

Rgure  9- 19,  Fault  coverage  curves  for  the  polynomial  evaluator  as  synthesized  by 

SYNTEST, 


Area  and  performance  figures  for  the  SYNTEST  versions  of  the  polynomial  evaluator 
are  shown  in  Table  9- 1,  The  increase  in  area  for  the  datapath  is  due  to  the  addition  of 
two  registers  to  hold  mwx  and  newout,  the  addition  of  a  flip-flop  to  hold  the  test  mode. 
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circuit 

transistor  count 

critical  delay  in  ns 

datapath 

controller 

datapath 

controller 

design 

1890 

446 

13.13 

6.35 

j  design-and-test 

2227 

699 

13.35 

7.93 

overhead 

18% 

50% 

1 _ 

2% 

1 

25% 

Table  9- 1.  Area  and  performance  figures  for  the  polynomial  evaluator  as 


synthesized  by  SYNTEST. 


and  the  replacement  of  a  two-to-one  multiplexer  with  a  larger  three-to-one  multiplexer 
(see  the  bolded  elements  of  Figure  9- 17).  The  increase  in  critical  delay  is  due  to  the 
fact  that  the  three-to-one  multiplexer  has  a  larger  delay  than  the  original  two-to-one 
multiplexer. 


In  some  cases,  area  can  be  saved  in  the  testable  datapath  by  not  bringing  in  a  com¬ 
pletely  new  controllable  point;  often,  a  controllable  signal  already  in  the  datapath  can 
be  re-used.  For  this  example,  it  is  possible  to  re-use  the  signal  d  instead  of  adding  a 
new  signal  newx-,  when  this  is  done,  the  area  overhead  for  the  testable  datapath  goes 
down  to  9%,  with  no  degradation  in  attainable  fault  coverage.  This  is  particularly 
effective  for  this  design  style,  for  which  each  primary  input  requires  its  own  register. 


The  increase  in  area  and  delay  for  the  controller  is  due  to  the  circular  BIST  insertion. 
The  overhead  for  this  example  is  quite  steep,  with  a  50%  increase  in  area  and  a  25%  in 
critical  delay.  Part  of  the  area  overhead  problem  is  that  the  controllers  for  these  kinds 
of  datapaths  are  very  small;  in  this  case,  the  controller  has  just  three  state  bits  and 
about  twenty  gates,  so  the  circular  BIST  functionality  adds  a  substantial  percentage. 
Note  that  the  controller  is  only  a  small  percentage  of  the  overall  circuit;  for  this  four 
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bit-wide  example,  it  is  less  than  ^  the  size  of  the  datapath.  Further,  if  an  eight  bit-  or 
sixteen  bit-wide  version  of  the  datapath  were  synthesized,  the  controller  would  not 
change,  so  the  controller  would  become  even  a  smaller  percentage  of  the  whole.  Thus, 
overhead  in  the  controller  is  not  as  large  a  concern  as  overhead  in  the  datapath. 


Hgure  9-20.  The  datapath  for  the  testable  polynomial  evaluator  behavior  as 
synthesized  by  BdA,  with  inserted  elements  in  bold. 


Next,  the  polynomial  evaluator  experiment  was  repeated,  this  time  using  the  Behav¬ 
ioral  Design  Assistant  (BdA)  to  perform  high  level  synthesis  instead  of  SYNTEST. 
The  resulting  datapath  for  the  design-and-test  version  is  shown  in  Figure  9-20,  with 
the  elements  inserted  to  enhance  testability  highlighted  in  bold.  The  design  style  is  sig¬ 
nificantly  different  from  that  used  by  SYNTEST;  latches  are  used  instead  of  registers, 
and  there  are  no  memory  elements  at  the  primary  inputs  of  the  datapath.  In  addition, 
the  multiplexers  used  by  BdA  are  different  from  those  used  by  SYNTEST;  the  BdA 
multiplexers  use  one-hot  encoding  on  their  select  lines.  Figure  9-2 1  shows  fault  cover- 
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age  curves;  like  the  SYNTEST  version,  there  is  significant  improvement  for  both  the 
datapath  and  the  controller.  Area  and  performance  figures  for  the  BdA  version  of  the 
polynomial  evaluator  are  shown  in  Table  9-2. 


Time  in  clocks 


(a)  for  the  datapath.  (b)  for  the  controller. 

Figure  9-21.  Fault  coverage  curves  for  the  polynomial  evaluator  as  synthesized 

by  BdA. 


circuit 

transistor  count 

critical  delay  in  ns 

datapath 

controller 

datapath 

controller 

design 

561 

12.54 

8.61 

design-and-test 

921 

12.66 

11.18 

overhead 

8% 

64% 

1% 

30% 

Table  9-2.  Area  and  performance  figures  for  the  polynomial  evaluator  as 

synthesized  by  BdA. 
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9. 7  Example  Two:  A  Differential  Equation  Solver 

Figure  9-22  shows  the  data  flow  for  a  second  example  that  implements  a  differential 


Rgure  9-22.  The  data  flow  graph  for  the  differential  equation  solver,  with 
annotated  randomness  /  transparency  pairs. 


equation  solver,  a  standard  high  level  synthesis  benchmark  [GDWL92].  ^  The  anno¬ 
tated  randomness  and  transparency  values  show  that  for  this  behavior,  no  behavioral 
test  point  insertion  need  be  done;  all  testability  values  are  above  the  thresholds.  Thus, 
the  difference  between  the  original  and  testable  versions  for  this  example  is  the  struc- 

1.  The  arithmetic  logic  units  marked  multiply  their  single  input  by  three. 
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tural  insertion  done  within  the  controller  and  at  the  interface  between  the  controller 
and  the  datapath.  Fault  coverage  curves  for  the  design  and  design-and-test  versions  of 
the  differential  equation  solver  are  shown  in  Figure  9-23.  The  difference  in  fault  cov¬ 
erage  between  the  design  datapath  and  the  design-and-test  datapath  is  due  to  the  addi¬ 
tion  of  some  test  points  at  the  interface  between  the  controller  and  datapath  to  increase 
the  observability  of  the  interface,  according  to  the  rules  given  in  Section  9.3  (see  Rg- 
ure  9-11). 


Time  in  clocks 


(a)  for  the  datapath. 


(b)  for  the  controller. 


Figure  9-23.  Fault  coverage  curves  for  the  differential  equation  solver  as 

synthesized  by  SYNTEST. 


Area  and  performance  figures  for  the  differential  equation  solver  are  given  in  Table 
9-3;  since  no  insertion  is  done  within  the  datapath,  there  is  zero  overhead  for  that  part 
of  the  circuit.  The  overheads  for  the  controller  are  very  similar  to  those  in  the  polyno¬ 


mial  evaluator  example. 
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circuit 

transistor  count 

critical  delay  in  ns 

datapath 

controller 

datapath 

controller 

design 

2373 

522 

12.98 

6.44 

i  design-and-test 

2373 

728 

12.98 

7.84 

1  1 

overhead 

1 

0% 

39% 

0% 

22%  i 

Table  9-3.  Area  and  performance 


igures  for  the  differential  equation  solver  as 


synthesized  by  SYNTEST. 


9.8  Example  Three:  The  Facet  Example 

Our  final  example  is  another  high  level  synthesis  benchmark  called  the  facet  example 
[GDWL92].  Its  operations  include  addition,  multiplication,  division,  and  the  logical 
operations  AND  and  OR.  Figure  9-24  and  Figure  9-25  show  the  steps  of  the  behavioral 
test  point  insertion.  Part  (a)  of  Figure  9-24  shows  the  data  flow  graph  for  the  facet 
example  with  randomness  values  annotated  on  each  edge.  Two  edges,  the  ones  coming 
from  the  division  and  the  logical  AND  operations,  have  randomness  values  below  the 
threshold  of  0.79.  Of  these  two  candidate  edges,  the  one  closest  the  primary  inputs  is 
selected  for  improvement.  In  this  case,  the  cause  of  low  randomness  is  an  inherent 
inability  of  the  integer  divider  to  transfer  the  randomness  at  its  inputs  to  its  output.  Fol¬ 
lowing  the  testability  insertion  guideline  of  Figure  9-6,  one  controllable  /  observable 
point  pair  was  inserted  to  slice  the  data  flow  graph  at  the  output  of  the  divider.  The 
new  data  flow  graph  is  shown  in  Figure  9-24(b)  with  new  randomness  values.  Note 
that  the  insertion  was  enough  to  bring  the  randomness  at  the  output  of  the  logical  AND 
operation  up  above  the  threshold. 
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At  this  point,  we  are  done  with  the  randomness  part  of  the  behavioral  test  point  inser¬ 
tion  procedure,  and  we  move  on  to  consider  transparency.  Figure  9-25(a)  shows  the 
same  data  flow  graph  as  Figure  9- 24(b),  this  time  annotated  with  transparency  values. 
Two  signals  have  transparency  below  the  threshold  of  0.40;  here,  the  cause  is  a  gradual 
degradation  in  transparency.  An  observable  point  is  inserted  at  a  node  of  medium 
transparency  to  fix  the  problem;  the  resulting  data  flow  graph  is  shown  in  Rgure 
9-25(b),  with  all  transparency  values  above  the  threshold.  At  this  point,  we  are  done 
with  our  behavioral  test  point  insertion  procedure,  and  the  data  flow  graph  of  Figure 
9-25(b)  is  our  derived  test  behavior. 

design  behavior  intermediate  behavior 
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intermediate  behavior 


test  behavior 


(a)  before  observable  point  insertion.  (b)  after  observable  point  insertion. 

Figure  9-25.  The  data  flow  graph  for  the  facet  example  with  annotated 

transparency  values. 


The  resulting  fault  coverage  curves  for  the  facet  example  synthesized  using  SYN- 
TEST  are  shown  in  Figure  9-26.  There  is  a  substantial  gain  in  fault  coverage  for  the 
datapath  due  to  the  behavioral  test  point  insertion;  there  is  also  a  substantial  gain  for 
the  controller  due  to  the  structural  circular  BIST  insertion.  Area  and  performance  fig¬ 
ures  for  the  SYNTEST  version  are  shown  in  Table  9-4.  The  facet  example  was  also 
run  using  BdA  instead  of  SYNTEST  for  the  high  level  synthesis.  Fault  coverage 
curves  for  this  experiment  are  shown  in  Figure  9-27,  with  area  and  performance  fig¬ 
ures  in  Table  9-5.  The  BdA  version  shows  results  similar  to  the  SYNTEST  version, 
despite  the  difference  in  design  style.  Area  and  performance  overheads  for  both  the 
SYNTEST  and  BdA  versions  are  similar  to  those  obtained  for  other  examples. 
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0  50  100 

Time  in  clocks 


(a)  for  the  datapath.  (b)  for  the  controller. 

Figure  9-26.  Fault  coverage  curves  for  the  facet  example  as  synthesized  by 

SYNTEST. 


circuit 

transistor  count 

critical  delay  in  ns 

datapath 

controller 

datapath 

controller 

design 

2927 

11.74 

5.59 

design-and-test 

3410 

717 

11.86 

7.98 

overhead 

17% 

76% 

1% 

43% 

Table  9-4.  Area  and  performance  figures  for  the  facet  example  as  synthesized  by 

SYNTEST. 


9.9  Summary 

In  conclusion,  we  have  presented  a  methodology  for  testability  insertion  in  datapath  / 
controller  pairs.  The  methodology  uses  behavioral  analysis  and  insertion  to  enhance 
the  testability  of  the  datapath,  and  structural  analysis  and  insertion  to  enhance  the  test¬ 
ability  of  the  controller  and  the  interface  between  the  datapath  and  the  controller.  The 
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(a)  for  the  datapath. 


Time  in  clocks 
(b)  for  the  controller. 


Figure  9-27.  Fault  coverage  curves  for  the  facet  example  as  synthesized  by  BdA. 


circuit 

transistor  count 

critical  delay  in  ns 

datapath 

controller 

datapath 

controller 

design 

1780 

474 

6.77 

design-and-test 

1980 

856 

10.67 

overhead 

11% 

81% 

1 _ 

0% 

58% 

Table  9-5.  Area  and  performance  figures  for  the  facet  example  as  synthesized  by 

BdA. 


combination  allows  us  to  achieve  high  fault  coverage  for  the  overall  circuit,  both  data¬ 
path  and  controller  together. 


Chapter  10 


Concluding  Remarks 


This  dissertation  develops  methodologies  for  BIST  insertion  across  the  design  hierar¬ 
chy.  It  demonstrates  that  despite  the  vast  differences  among  circuits  at  different  levels 
of  design  abstraction,  the  basic  principles  of  testability  remain  the  same.  Regardless  of 
the  level  of  design  abstraction,  a  good  test  relies  on  the  ability  to  deliver  good  quality 
patterns  to  the  circuit  elements,  and  to  propagate  the  effect  of  faults  to  the  primary  out¬ 
puts.  Thus,  the  testability  concepts  that  apply  at  the  gate  level  are  equally  applicable  at 
the  register  transfer  and  algorithmic  levels. 

Furthermore,  regardless  of  the  level  of  design  abstraction  at  which  testability  enhance¬ 
ment  is  done,  a  balance  must  be  achieved,  with  test  quality  on  one  side,  and  area  and 
system  performance  on  the  other.  To  be  seen  in  full  context,  a  gain  in  fault  coverage 
must  be  weighed  against  the  price  of  adding  extra  test  circuitry  to  the  chip,  and  the  per¬ 
formance  lost  when  the  system  must  be  slowed  down  because  test  circuitry  has 
intruded  on  the  critical  paths.  Throughout  this  dissertation,  we  make  this  trade-off 
clear  by  placing  fault  coverage  curves  side-by-side  with  layout  areas,  transistor  counts, 
and  critical  delays. 
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The  consideration  of  testability  can  begin  at  any  point  in  the  ASIC  design  flow.  It  is 
best  to  consider  testability  as  early  in  the  design  flow  as  possible.  Ideally,  this  means 
thinking  about  testability  before  any  kind  of  synthesis  begins,  when  a  design  descrip¬ 
tion  is  at  the  algorithmic  level  in  the  behavioral  domain.  In  Chapter  Nine,  a  methodol¬ 
ogy  for  adding  testability  to  behaviors  prior  to  high  level  synthesis  is  presented.  This 
method  for  behavioral  test  point  insertion  is  combined  with  a  behavioral  test  scheme 
that  provides  high  fault  coverage  for  the  datapath  portion  of  the  circuit.  Structural  test 
point  insertion  and  a  circular  BIST  scheme  are  added  for  the  controller  portion,  to  pro¬ 
vide  a  high  quality  test  of  the  overall  datapath  /  controller  pair. 

A  major  strength  of  our  methodology  at  the  algorithmic  level  is  the  fact  that  it  does  not 
rely  on  the  use  of  a  specific  design  style.  Experiments  with  two  different  high  level 
synthesis  systems  show  significant  gains  in  fault  coverage  for  two  different  design 
styles.  Since  the  method  is  generic  with  regards  to  design  style,  it  can  be  used  to  aug¬ 
ment  an  existing  synthesis  process  with  a  minimum  of  fuss.  Another  strength  of  the 
behavioral  test  scheme  presented  in  Chapter  Nine  is  the  fact  that  the  control  signals 
needed  for  the  datapath  during  test  are  almost  the  same  as  the  ones  used  in  normal 
mode.  This  means  that  the  test  controller  can  be  easily  embedded  within  the  design 
controller,  with  minimal  changes  to  the  control  state  diagram. 

Although  it  is  best  to  consider  testability  as  early  as  possible  in  the  design  flow,  it  is 
not  always  possible  to  start  in  the  behavioral  domain.  High  level  synthesis  from  behav¬ 
ioral  descriptions  is  a  fairly  new  concept,  and  many  circuits  are  still  designed  starting 
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at  the  register  transfer  level.  In  Chapter  Eight,  we  present  a  testability  insertion  meth¬ 
odology  appropriate  for  these  circuits.  Examples  using  the  circular  self-test  path  tech¬ 
nique  show  that  by  using  testability  metrics  to  choose  where  to  place  test  points,  we 
can  significantly  improve  testability,  either  in  terms  of  attainable  fault  coverage  or  the 
number  of  test  patterns  needed  to  achieve  high  fault  coverage. 

There  are  times  when  even  a  register  transfer  level  analysis  can  not  be  done.  Some  cir¬ 
cuits  are  inherently  gate  level,  and  must  be  treated  as  such.  Chapter  Seven  deals  with 
testability  insertion  in  gate  level  circuits.  There,  the  goal  is  to  enhance  the  testability  of 
the  circuit  while  modifying  the  circuit  as  little  as  possible.  We  show  two  variations  on 
the  testability  insertion  methodology,  one  based  on  circular  BIST  and  the  other  on  test 
point  insertion  using  multiplexers.  In  both  cases,  the  insertion  is  used  to  systematically 
provide  good  quality  pseudorandom  test  patterns  to  the  combinational  logic  of  the  cir¬ 
cuit,  and  to  combat  random  pattern  resistance  within  the  combinational  logic.  The 
methodology  is  demonstrated  on  a  submodule  of  an  industrial  design;  both  the  circular 
BIST  and  the  test  point  variations  show  significant  improvement  in  fault  coverage. 

Whether  dealing  with  gate  level,  register  transfer  level,  or  algorithmic  level  circuits, 
there  are  some  structural  details  that  must  be  handled  carefully  when  the  methodology 
of  choice  is  circular  BIST  or  the  circular  self- test  path  technique.  Chapter  Six  outlines 
two  problems  that  can  occur,  both  due  to  bit  level  correlation.  The  first,  register  adja¬ 
cency,  is  a  by-product  of  the  order  of  the  circular  BIST  test  flip-flops  within  the  path. 
The  second  is  a  kind  of  correlation  inherent  in  the  shifting  nature  of  the  path,  and  was 
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identified  first  in  this  work.  When  present  in  a  circuit,  these  two  kinds  of  correlation 
can  severely  degrade  test  quality.  Identification  of  these  problems  is  a  necessary  and 
major  step  towards  being  able  to  do  circular  BIST  and  circular  self-test  path  insertion 
automatically. 

The  cornerstone  to  the  BIST  insertion  methodologies  of  Chapter  Seven,  Eight,  and 
Nine  is  the  set  of  testability  metrics  presented  in  Chapter  Four.  These  metrics  play  a 
major  role  in  the  circular  self-test  path  correlation  analysis  of  Chapter  Six  as  well.  Our 
controllability  metrics,  randonmess  and  expected  state  coverage,  were  borrowed  from 
previous  research;  our  testability  metric,  transparency,  is  new  to  this  work.  The 
Markov  chain  model  used  to  calculate  the  metrics,  detailed  in  Chapter  Five,  is  an 
advance  over  the  earlier  models  on  which  it  was  based  in  terms  of  the  way  that  it  mod¬ 
els  correlation  due  to  reconvergent  fanout  and  indirect  feedback. 

It  is  important  to  note  that  although  we  chose  in  this  work  to  use  randomness,  expected 
state  coverage,  and  transparency  to  quantify  the  testability  of  BIST  circuits,  the  BIST 
insertion  procedures  presented  do  not  rely  on  the  use  of  a  specific  set  of  testability 
metrics.  Any  appropriate  set  of  testability  metrics  may  be  used.  As  seen  in  Chapter 
Three,  the  survey  of  related  research,  there  has  been  a  great  deal  of  interest  in  high 
level  testability  metrics  recently,  all  with  a  different  emphasis,  or  designed  to  help 
attain  a  different  goal.  We  believe  that  by  changing  the  underlying  metrics,  our  test¬ 
ability  insertion  procedures  can  be  easily  modified  for  other  purposes.  For  example, 
we  believe  that  the  insertion  procedures  need  not  be  limited  to  BIST;  with  an  appropri- 
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ate  set  of  metrics,  they  could  be  used  for  automatic  test  pattern  generation  (ATPG)- 
based  testability  insertion  as  well.  We  plan  to  test  this  theory  using  the  algorithmic 
level  methodology. 

We  also  plan  to  do  some  additional  work  in  the  area  of  testing  controllers.  In  our  algo¬ 
rithmic  level  work,  we  use  circular  BIST  to  augment  the  testability  of  our  finite  state 
machine  controllers,  inserting  the  circular  BIST  at  the  gate  level.  We  would  like  to 
move  the  consideration  of  testability  up  to  a  higher  level  of  design  abstraction.  It  is  dif¬ 
ficult  to  see  how  testability  insertion  could  be  moved  all  the  way  up  to  the  algorithmic 
level,  since  there  is  no  concept  of  a  controller  until  the  datapath  is  created  during  high 
level  synthesis.  However,  it  may  be  possible  to  do  testability  insertion  directly  after 
high  level  synthesis,  while  the  controller  is  in  the  form  of  a  state  diagram.  Two  recent 
projects  at  other  universities  have  looked  at  modifying  state  diagrams  to  make  control¬ 
lers  more  easily  tested  using  automatic  test  pattern  generation  [ChKA92]  [KaCA93] 
[KaCA95]  [PoRe93].  We  would  like  to  explore  the  properties  of  controllers  that  are 
important  to  a  BIST  scheme,  and  perhaps  to  develop  a  similar  methodology  that  is 
appropriate  to  BIST,  and  that  blends  well  with  the  behavioral  BIST  scheme  we  are  cur¬ 
rently  using  to  test  the  datapath  portion  of  the  circuit.  The  goal  of  such  a  methodology 
would  be  to  improve  on  the  area  overheads  that  are  necessary  to  do  a  full  circular 
BIST  insertion  in  the  controllers;  in  Chapter  Nine,  we  saw  that  these  overheads  ranged 


from  50%  to  81%  for  our  examples. 
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